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I.  Summary  of  Research  Accomplishments 

>  This  project  was  concerned  with  simulated  annealing,  a  Monte  Carlo  method  for  obtaining  glo¬ 
bally  optimal  or  nearly  globally  optimal  solutions  to  a  variety  of  optimization  problems.  -  '  - 

We  have  achieved  key  results  in  two  main  areas: 

i)  characterizing  the  cooling  rate  necessary  and  sufficient  for  simulated  annealing  to  hit  the  global 
minimum,  and 

ii)  obtaining  a  novel  upperbound  for  the  time-constant  of  convergence  of  simulated  annealing  at  a 
fixed  temperature  to  its  equilibrium  distribution  and  studying  the  growth  of  this  bound  as  the  tem¬ 
perature  approaches  zero  asymptotically. 


Simulated  annealing  with  a  time  varying  temperature  gives  rise  to  a  time  inhomogeneous  Markov 
chain.  This  Markov  chain  is  difficult  to  analyze  and  study  due  to  the  time-inhomogeneity.  We  have 
been  able  to  obtain  a  novel  theory  for  analyzing  such  processes.  We  have  introduced  a  notion  of 
“recurrence  order”  associated  with  each  state  of  a  Markov  chain.  Essentially,  this  recurrence  order 
characterizes  the  rate  of  convergence  of  the  occupation  probability  for  the  state.  Our  central  result  con¬ 
sists  of  the  discovery  that  these  recurrence  orders  satisfy  a  “balance  equation”  across  every  cutset  of 
the  graph  of  the  chain  for  the  general  class  of  such  Markov  chains.  For  the  special  case  of  simulated 
annealing  with  symmetric  neighborhoods,  they  even  satisfy  a  “detailed  balance”  for  every  pair  of 
states. 


These  balance  equations  transform  the  analytical  problem  of  determining  the  asymptotic  behavior 
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of  simulated  annealing  into  a  purely  algebraic  problem  of  solving  the  balance  equations.  We  have  ~ 

obtained  graph  theoretic  algorithms  for  solving  such  balance  equations  in  general,  and  for  simulated 

n _ 

annealing  in  particular  have  obtained  explicit  solutions.  - - 


From  the  explicit  solutions  to  the  balance  equations  we  have  been  able  to  characterize,  i.e.,  obtain  f  _ 
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the  global  minimum  of  the  optimization  problem  with  probability  one. 

The  above  results  are  detailed  in  [1-3]  of  the  attached  list  of  Publications. 

The  behavior  of  simulated  annealing  at  a  fixed  temperature  can  be  modeled  by  a  reversible  time- 
homogeneous  Markov  chain  converging  to  an  equilibrium  distribution  at  that  temperature.  As  the  tem¬ 
perature  goes  to  zero  asymptotically,  the  equilibrium  distributions  themselves  converge  to  the  optimal 
distribution.  In  [4],  we  have  obtained  a  novel  upper  bound  for  the  second  largest  eigenvalue  of  a  finite 
reversible  time-homogeneous  Markov  chain  as  a  function  of  three  parameters,  namely,  the  smallest 
transition  probability,  the  underlying  structure  of  the  chain,  and  the  skewness  of  the  equilibrium  distri¬ 
bution.  This  eigenvalue  bound  enables  us  to  bound  the  time-constant  of  convergence  of  a  reversible 
Markov  chain  to  its  equilibrium  distribution.  In  particular,  we  can  bound  the  time  constant  of  conver¬ 
gence  of  a  fixed-temperature  simulated  annealing  algorithm  solving  a  particular  instance  if  an  optimi¬ 
zation  problem.  Moreover,  we  can  study  the  growth  of  this  bound  as  the  temperature  approaches  zero 
or  skewness  becomes  arbitrarily  large;  thereby,  providing  a  fairly  good  understanding  of  the  tempera¬ 
ture  asymptotics  of  the  simulated  annealing  algorithm.  We  exhibit  a  class  of  Markov  chains  on  which 
our  bound,  treated  as  a  function  of  skewness  alone,  is  asymptotically  tighter  than  previously  established 
bounds  based  on  a  certain  parameter  known  as  the  conductance  of  the  Markov  chain.  We  also  show 
that  our  bound  is,  in  general,  much  easier  to  compute  for  simulated  annealing  chains. 

More  recently,  we  have  achieved  what  we  believe  to  be  a  significant  breakthrough  in  understand¬ 
ing  the  size-asymptotics  of  a  time-homogeneous  simulated  anne,  hug  chain  solving  a  particular  com¬ 
binatorial  optimization  problem  known  as  the  Integer  Knapsack  problem.  For  this  NP-Hard  problem, 
we  have  been  able  to  derive  sufficient  conditions  under  which  the  time-constant  of  convergence  of  a 
fixed-temperature  simulated  annealing  chain  is  a  polynomial  in  the  size  of  the  problem.  Combining 
this  with  an  in-depth  study  of  cost  distributions  and  density  of  states,  we  have  shown  that  for  certain 
versions  of  the  Integer  Knapsack  problem,  a  fixed-temperature  simulated  annealing  algorithm  can  find  a 


state  with  cost  sufficiently  close  to  the  global  minimum  in  polynomial  time  with  overwhelming  proba¬ 
bility.  The  manuscript  containing  these  results  is  still  under  preparation  and  will  be  made  available  as 
soon  as  it  is  ready  for  publication. 
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SIMULATED  ANNEALING  TYPE  MARKOV  CHAINS  AND  THEIR  ORDER 

BALANCE  EQUATIONS* 

DANIEL  P.  CONNORS^  and  P.  R.  KUMARi 


Abstract.  Generalized  simulated-annealing  type  Markov  chains  where  the  transition  probabilities  are 
proportional  to  powers  of  a  vanishing  small  parameter  are  considered.  An  “order  of  recurrence,”  which 
quantifies  the  asymptotic  behavior  of  the  state  occupation  probability,  is  associated  with  each  state.  These 
orders  of  recurrence  satisfy  a  fundamental  balance  equation  across  every  edge-cut  in  the  graph  of  the  Markov 
chain.  Moreover,  the  Markov  chain  converges  in  a  Cesaro-sense  to  the  set  of  states  having  the  largest 
recurrence  orders.  These  results  convert  the  analytic  problem  of  determining  the  asymptotic  properties  of 
the  time-inhomogeneous  stochastic  process  into  a  purely  algebraic  problem  of  solving  the  balance  equations 
to  determine  the  recurrence  orders. 

Graph  theoretic  algorithms  are  provided  to  determine  the  solutions  of  the  balance  equations.  By  applying 
these  results  to  the  problem  of  optimization  by  simulated  annealing,  it  is  shown  that  the  sum  of  the  recurrence 
order  and  the  cost  is  a  constant  for  all  states  in  a  certain  connected  set,  whenever  a  “weak-reversibility” 
condition  is  satisfied.  This  allows  the  necessary  and  sufficient  condition  for  the  optimization  algorithm  to 
hit  the  global  minimum  with  probability  one  to  be  obtained. 

Key  words,  simulated  annealing,  optimization,  Markov  chains 

A\IS<MOS)  subject  classifications.  60JIO,  90C27 

1.  Introduction.  We  consider  finite  state  Markov  chains  {.v(l)f  with  transition 
probabilities  of  the  type 

/>„('*  =  c„f(/)\ 

where  fin  is  a  small  parameter  converging  to  zero.  In  a  previous  paper  [7]  we  have 
shown  that  if  we  define  "orders  of  recurrence"  by  (more  precise  definitions  are  given 
in  §  2) 

13,  sup  |  c  5  0:  X  eitVn.U)  =  +xj, 

then 

(i)  These  recurrence  orders  satisfy  a  balance  equation,  max,.  -v  (/3,  —  V„)  = 
max,  y,  (/3,  -  V„),  for  every  subset  A;  and 

(ii)  The  Markov  process  converges  to  the  set  of  states  with  the  largest  orders  of 
recurrence. 

This  provides  a  novel  approach  to  analyzing  the  asymptotic  behavior  of  such 
time-inhomogeneous  Markov  processes.  Specifically,  we  use  ( i )  to  solve  the  balance 
equations,  and  then  (ii)  provides  the  limiting  behavior.  Moreover,  the  orders  of 
recurrence  also  provide  information  about  the  rates  of  convergence  of  the  state 
occupation  probabilities.  This  approach  via  recurrence  orders  therefore  converts  the 
analytic  problem  of  determining  the  asymptotic  behavior  of  the  time-inhomogeneous 
process  into  a  purely  algebraic  problem  of  solving  the  balance  equations. 
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University  of  Illinois,  M01  W.  Springfield  Avenue,  Urbana,  Illinois  61801. 
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A  significant  motivation  for  studying  such  Markov  chains  lies  in  the  fact  that  in 
the  method  of  optimization  by  simulated  annealing,  if  {  W^}  is  the  cost  function  whose 
minimum  is  sought,  then  we  obtain  a  Markov  chain  with 

P„(i)  =  C„f  ( t )  ' 

Thus  simulated  annealing  is  a  special  case  where  the  powers  V„  satisfy 

V„  :=  max  (0,  VV;  —  W,), 


for  some  {  W, 

To  pursue  the  above  approach  to  analyzing  such  time-inhomogeneous  Markov 
chains,  it  is  necessary  to  be  able  to  solve  the  balance  equations.  However,  there  can 
be  nonunique  solutions  to  the  balance  equations.  We  present  graph-theoretic  circulation 
based  algorithms  to  obtain  a  solution,  as  well  as  all  solutions,  to  the  balance  equations. 
We  show  by  an  example  the  interesting  phenomenon  that  such  nonuniqueness  can 
arise  when  the  asymptotic  properties  of  the  Markov  process,  and  the  recurrence  orders, 
depend  not  just  on  the  exponents  V„,  but  also  on  the  proportionality  constants  c,r 

By  applying  these  results  to  the  Markov  chain  arising  from  the  method  of  optimiz¬ 
ation  by  simulated  annealing  when  the  "weak  reversibility"  condition  of  Hajek  [1] 
holds,  we  show  that  the  sum  of  the  recurrence  order  and  the  cost  is  a  constant  on  sets 
connected  by  recurrent  arcs.  This  allows  us  to  obtain  the  necessary  and  sufficient 
condition  for  the  optimization  algorithm  to  hit  the  global  minimum  with  probability 
one.  Our  necessity  result  is  a  stronger  sample  path  result  than  is  found  in  [1]  or  [2], 

Background.  Tsitsiklis  [2]  has  also  investigated  Markov  chains  with  transition 
probabilities  proportional  to  powers  of  a  small  time-varying  parameter.  His  analysis 
was  based  on  observing  that  due  to  the  slow  variation  of  jf(r)},  we  can  employ  bounds 
on  the  state  occupation  probabilities  for  stationary  Markov  chains,  where  fit)  is  held 
constant,  to  obtain  hounds  for  the  time-inhomogeneous  case.  His  approach  is  quite 
different  from  ours. 

Based  on  an  analogy  to  the  physical  process  of  annealing,  the  sequence  fit)  is 
called  the  "cooling  schedule,"  and  just  as  in  the  physical  analogy  it  plays  a  key  role 
in  determining  asymptotic  behavior.  It  has  been  shown  by  Geman  and  Geman  [3], 
Mitra,  Romeo,  and  Sangiovanni-Vincentelli  [4],  andGidas  [5],  that  simulated  annealing 
converges  in  probability  to  a  minimum  of  the  optimization  problem  provided 
X,'  „fU)',  =  +x  for  large  enough  p.  Hajek  [I]  has  determined  the  necessary  and 
sufficient  conditions  on  the  value  of  p  for  the  algorithm  to  converge  in  probability  to 
the  minimum  when  a  "weak  reversibility"  assumption  is  satisfied. 

2.  Orders  of  recurrence  and  balance  equations.  Consider  a  Markov  chain  over  a 
finite  state  space  X  whose  transition  probabilities  are  proportional  to  powers  of  a 
vanishing  time  varying  parameter  fit);  that  is,  the  transition  probabilities  p, ,(!)'■= 
Pr  ( .v(  t  +  I )  -j  |  ,v(  t )  =  i )  are  given  by 

( I )  p„(  / )  =  f,,f  ( f  1 1 1  for  all  i,jc  X,  i  ^  j,  and  t  e  J *.  and  p„{  t )  =  1  —  £  pit(  r) 


where 

( 2 )  Os  V„  S  +oc  for  all  i,  j  e  X,  i  *  j. 

(3)  r„§0  for  all  i ,je  X,  i  ^  j,  and  £  c„  =  1  for  all  i. 
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Regarding  the  small  parameter  {e(r)},  we  will  assume  that, 

(4)  Q<  e(t)<  1  for  all  /e3T, 

(5)  3JW  <  oc  such  that  r(t)§  Me (s)  whenever  t  g  s,  and 


(6) 


£  e(t)p<oc ■  for  some  p e [1, +oc). 
<  - 1 


In  what  follows  we  will  assume  that  in  (l)-(3)  we  have 

cu  =  0<=>  V,,  =  +oo, 

which  is  clearly  without  any  loss  of  generality.  We  shall  denote  by  N,  the  set  of  all 
states  j  with  c„>0.  Finally,  we  will  assume  that  the  Markov  chain  is  “connected;” 
i.e.,  for  every  i,j e  X,  there  exists  a  path  i  =  i0,  ■  •  • ,  ip  =  j ,  with  i,  £  A/,,  ,  for  1  g  /  g  p. 

Let  7r,(r);=  Pr(.x(t)  =  /)  be  the  probability  distribution  of  x(t),  and  let  it, ,(*):= 
Prl.v(r)=  i,  ,v(  i  +  l)=y)  be  the  pi  obability  of  a  transition  from  state  i  to  j  at  time  t. 

The  following  example  motivates  the  notion  of  “orders  of  recurrence"  introduced 
in  [7], 

Example  1.  Suppose,  for  a  certain  Markov  chain  (with  more  than  two  states!), 
we  have 


tr,(/)  =  l/r''\  *,(/)  =  l/r'\  e(t)=i/t'\ 


Then  note  that  T,\„  f(t)l7r,(f)  is  finite  if  c>/3,:=  2  and  +oo  if  cg/3,.  Similarly, 
1,'  ,,  f(  t )'  7r:(  r)  is  finite  if  c  >  1  and  +oc  if  eg  Now  7r,(  t)  converges  to  zero 

more  slowly  than  ir:(t)  and  it  is  easy  to  see  that  this  information  is  also  captured  bv 
the  demarcation  points  0,  and  which  thus  provide  a  measure  by  which  to  rank 
the  rates  at  which  tt , ( r )  and  7r:(t)  converge  to  zero. 

Motivated  by  this  we  define  the  recurrence  orders  for  the  states  and  transitions  of 
the  Markov  process,  as  follows. 

Dt  hnition  1.  The  order  of  recurrence  of  a  state  ie  X ,  denoted  0,,  is 


f-x 


0,  ;= 


P 


if  1  7T,t/)<+X, 

I  (I 

ifp  =  supjcgO:  X  e(i)'TT,(t)  =  +oo 

l  t  1) 

if  p  -  max  (  c  s  0:  V  f  ( / )'  rr, ( r )  =  +oc 


|  and  V  e(t )ptt, 

1 


(f)<  +oc, 


We  say  a  state  i  is  transient  if  fi,  =  — oc;  otherwise  we  say  the  state  is  recurrent. 

In  a  similar  manner  we  define  the  order  of  recurrence  of  the  transition  from  i  to  j. 
Db.f  inition  2.  The  order  of  recurrence  of  the  transition  from  state  /  to  j ,  denoted 

fi.n  IS 


if  I  7T„(  /  )  <  +X, 
t  n 


if  p  =  sup  j  c gO:  ^  F(f)‘7r„(r)  =  +oc[  and 


if  p  =  max 


cgO:  V  e(i )‘  tt„  ( t )  =  +oc 


V  f  ( t)rir„(t )  <  +ac, 

t  <) 


Again,  we  say  the  transition  from  i  to  j  is  transient  if  f}„  =  -oc;  otherwise  we  say 
the  transition  is  recurrent. 
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It  is  also  convenient  to  define  p ,  the  order  of  cooling  of  {eft)},  as  follows. 
Dleinition  3.  The  order  of  the  cooling  schedule  {eft)},  denoted  p,  is  defined  as 


if  X  eft)  <  +oe. 


if  p  =  sup  I  c  =  0:  V  e(l)‘  =  +oc  1  and  V  e(t)r  <  -Hoc, 

1  r-CI  J  i-o 

if  p  =  max  jcSO:  T  e(t)‘  =+ooj. 


The  relationship  between  /3,,  /3,,,  and  p  is  given  in  the  following  lemma.  It  will 
be  convenient  in  the  sequel  to  define  the  operation  “©"  as  follows: 


aQb 


{-oo  if  a  <  b, 

a-b,  if  a  g  b. 


Lemma  1.  /3„  and  (3,  are  related  by 
( 7 )  /}„=/},©  V„  for  all  t,  j  e  X , 

while  p  and  ji,  are  related  by 


(.81 


max  (3,  =  p. 


Proof.  If/ 2  /V,,  then  it  immediately  follows  that  f3„  =  —or.  If  /'e  N,,  then  application 
of  the  Chapman- Kolmogorov  equation 

tr„<  f )  =  TT,(t)p„(t) 

=  c„e(  t )  l"7T,(  t ), 

gives  the  first  assertion.  Similarly,  since 


1  f  I  t)r  =  f-f/i^TT,  ft), 

I  o  |  .— V  10 

the  second  assertion  also  follows.  □ 

Knowledge  of  the  /3,'s  provides  useful  information  about  the  asymptotic  properties 
of  {.v( / )}.  The  following  theorem  shows  that  the  time-inhomogeneous  Markov  chain 
converges  in  a  C'esaro  sense  to  the  set  of  states  having  the  largest  orders  of  recurrence. 
Theorem  1.  Let  .11  be  the  set  of  states  with  the  largest  orders  of  recurrence: 

■  It {i  e  X :  (3,  =  p}. 

Then 


I  N 

limsup—  x  Pr  (.v(f)e  .U )  =  1. 

V  •  v  N  ,  , 


Proof.  Let  us  first  consider  the  set  .It  defined  by 

p  ff  if  p  =0,  -oo  or  p  for  some  pz  .‘it,  p  >  0, 

.  t(\J  {i  c  X :  /3,  =  p  }  if  p  =  p  for  some  p  e  .JA,  p>  0. 

Note  that  if  p  =  p.  then  U  may  be  slightly  larger  than  M  since  it  includes  states,  if 
any,  whose  recurrence  orders  are  p  ;  otherwise  it  is  the  same  as  .{(.  We  will  first  show 
that 


1  N 

limsup—  £  Pr (x(t)zM)=  I. 


(10) 
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Consider  first  the  case  p  >  0.  Clearly,  p  =  p  or  p  for  some  p  e  5?,  where  p  >  0.  Let 
Q  =  {q  e  Jt  '.  for  some  &  =  q  or  q~}. 

Let  6  =  infvt  y  ( p  -  q ),  where  inf  0  =  +oo.  Let 

'6  if  0  <  +oo, 
l p  if  9  =  +oo. 

Consider  the  states  in  AC  and  observe  that  for  sufficiently  small  5>0, 


•={; 


£  Pr  (x(r)  e  Jic)e(l)p~y+Ii <  +oc, 

I  -o 

since  the  state  space  is  finite.  An  application  of  Kronecker's  Lemma  (see  Chung  [6]) 
gives 

N 

lim  e(N)p  £  Pr  (x(t)e  M‘  )  =  0; 

i  =  i 

that  is, 

1  N 

(11)  lim  (Nff/Vr  v+s)—  £  Pr(x(t)e.r  )  =  0. 

N 

Now  we  claim  that 

(12)  lim  sup  Ne(N)p~y*a  >0. 

N-jc 

Suppose  not.  Then, 

lim  Ne[N)p-y+s  =  0, 

.N  .  x 

and  so 


In  particular,  we  have 


implying  that 


1/(V 

1/N 


i1?. 


(  p  p  -  y  + 


=  +  oo. 


(1/  y+6) 

lim  - —r - =  +oo. 

A-*  cfAf) 


#>-* 


However,  since  „  f(f )p  *  =  +°o,  this  would  imply  that 

=  +oo  for  all  small  5  >  0, 


which  is  false.  Hence,  (12)  holds  and  from  (11)  we  deduce  that 

1  N 

lim  inf—  £  Pr(x(t)eJC)  =  0. 

N~x  N 


(13) 
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But  since 

—  Y  Pr(x(/)e  -«)  +  77  I  Pr  (x(<)€  M')  =  1, 

N  ,r,  jV  , = , 

the  result  (10)  follows. 

Now  turn  to  the  case  p  =  0.  Then  clearly,  Pr  (■*(<)  e  M')<  +oo,  and  so  (13) 
is  again  true  and  the  result  (10)  follows. 

If  p  =  -oo,  the  result  ( 10)  is  trivial. 

To  proceed  from  (10)  to  (9),  it  is  clearly  sufficient  to  show  that  in  the  case  p  =  p 
for  some  p  e  y?,  p  >  0, 

lim  Pr  (x(f)e  {i:  fi,  =  p~})  =  0. 

t  —  X 

This  involves  some  results  on  the  structure  of  the  recurrence  orders  and  is  demonstrated 
in  Lemma  5.  □ 

Thus,  knowledge  of  the  recurrence  orders  {&}  provides  knowledge  about  the 
asymptotic  properties  of  the  time-inhomogeneous  Markov  chain.  In  fact,  as  the  reader 
may  see  from  Example  1,  the  recurrence  orders  also  provide  information  about  the 
rates  of  convergence. 

Our  goal  therefore  is  to  determine  the  recurrence  orders,  and  critical  to  that  will 
be  the  following  result  established  in  [7],  which  shows  that  there  is  a  fundamental 
balance  of  recurrence  orders  across  every  edge-cut  in  the  graph  of  the  Markov  chain. 
Theorem  2  (Order  Balance). 

(14)  max  /3„  =  max  /3„  for  every  Ac  X. 

ii  A.j<  A'  i-  A./c  A' 

Equivalently ,  using  the  "Q"  notation  and  (7), 

(15)  max  fi, ©  V„  =  max  /3, Q  V',  for  every  Ac  X. 

if  A,/-  A'  it  .4,/v  A' 

Proof.  We  sketch  the  proof;  see  [7]  for  the  precise  proof.  Choose  Ac  X  and  note 
that  if  { t(  n)(„  .,  is  the  sequence  of  random  times  at  which  the  process  moves  from  A 
to  A',  while  {cr{n)}„  ,,  is  the  sequence  of  random  times  at  which  the  process  moves 
from  A'  back  to  A,  then  we  have 

t(h)  <  cr(n)  <  r(n  +  1), 

where  we  have  assumed,  without  loss  of  generality,  that  x(0)e  A  to  give  r(  1 )  <  cr(  1 ). 
Using  this  it  follows  from  (5)  that 

+.  X  +  X 

T  f  (;)‘7(.x(f)p  A',  x(t+  1)6  A)  =  £  e(o-(n))‘ 

l  O  n  I 

S  M‘  £  e(T(n))‘ 

n  I 

=  Mr  I  F(r),7(x(r)e4,.x(r+I)e A') 

t  0 

=  M‘  f  e( r(n  +  1))‘  +  M1f(t(  1 ))‘ 

n  ~  I 

E  X 

SM!'  I  f(<r(n))'  +  M:,E(0r' 

n  *=  I 

=  Mlc  I  e(  t )7(x(  t)  6  A\  x(  t  +  1 )  e  A)  +  M2t e(0)c. 

l  —  O 
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By  taking  expected  values  and  using  the  Monotone  Convergence  Theorem  it  follows 
that  ’ 


E/('r  1  *■,(»)  <+oo«*  l  eUY  l  njt)<  +oc 

--0  It  AJt  A  ,,0  ,cA\j,:A 

Hence  both  sides  above  converge  or  diverge  together.  Now  if  c  is  so  large  that  every 
term  on  the  left-hand  side  with  ie  A,  jeAc  converges,  then  clearly  c  is  also  so  large 
that  every  term  on  the  right-hand  side  converges.  Thus, 

r>  max  0o<ac>  max  0 v. 

K  A.jt  A'  icAJeA* 

Likewise  if  c  is  small  enough  so  that  some  term  on  the  left-hand  side  diverges  then 
c  ,s  a,so  smaI1  enou«h  so  that  some  term  on  the  right-hand  side  diverges,  and  so 

□ 


cS  max  max  0ti. 

i t  A,jc  A'  ieA,)e  A' 


Note  that  through  Theorems  1  and  2  we  have  converted  the  problem  of  determining 
the  asymptotic  properties  of  the  time-inhomogeneous  Markov  chain  into  an  algebraic 
problem  of  solving  the  balance  equations  (14).  Note  that  (14)  provides  a  maximum 
ot  r  equations,  one  for  each  edgecut. 

^The  ba,a"Ce  w>uations-  Note  ‘hat  if  (0, ,  02,  ■  ■  ■ ,  0|X|)  satisfy  ( 15), 

then  (0,  a,02-a,  ■  ■  •  ,ft*,-a)  also  satisfy  (15)  for  every  a,  i.e.,  the  solution  set  is 

translation  invariant.  Thus  (8),  which  fixes  the  maximum  of  the  ft’s,  also  needs  to  be 
taken  into  account. 

However,  (15),  (8)  together  can  still  possess  nonunique  solutions  for  sufficiently 
small  values  of  p.  In  this  section,  we  will  show  how  we  can  obtain  one  solution  to 
( 15),  (8);  in  the  next  section  we  show  how  to  obtain  all  solutions. 

In  cases  where  there  is  a  unique  solution  to  the  order  balance  equations  the 
algorithm  of  this  section  gives  an  0([X|’)  algorithm  for  determining  it,  compared  to 
the  algorithm  of  4  for  obtaining  all  solutions  (in  the  nonunique  case),  which  is 
exponential  ,n  \X\.  Also,  the  results  of  this  section  are  used  in  the  analysis  of  the 
simulated  annealing  algorithm  in  §  5. 

It  is  convenient  to  consider  the  following  “modified”  balance  equations  that,  as 
we^how  in  the  sequel,  always  possess  a  unique  solution.  Given  p^O  and  V,  SO  for 
thar  ’  "  ’  ’ '  WUh  '  *j' consider  problem  of  determining  A  :=  (A, ,  •  •  • ,  Aw)  such 


(16) 

and 

(17) 


max  A,  ki,  max  A,  —  V,,  for  every  Ac  (i 

*  ■<  A.>«  A •  1  ’ 


.IXIh 


max  Af  =  p. 


-Call-V,7>  the,“modlfied'-  balance  equations.  Observe  that  (16)  differs  from 
in  116)  H  °Peratl0n  is  used  in  P|ace  of  “©•”  Also,  the  A's  can  be  negative 

We  have  introduced  the  modified  balance  equations  to  avoid  the  difficulties  in 
handling  -oc  that  occur  under  the  “©"  operation. 

^  M,TAEOrAEM3T^erieSof0rderBalanceandModifiedBaIanceEcluations)-  (1)  If 

A  satisfies  the  modified  balance  equations  for  a  given  p  and  V,  then  0  defined  bv 

(l8)  ft:=A(©0 

satisfies  the  order  balance  equations  (15),  (8)  for  the  given  p  and  V. 
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(2)  For  every  given  p  and  V,  there  exists  a  unique  solution  A  to  the  modified  balance 
equations.  Moreover,  the  solutions  for  different  values  of  p  are  translates  of  each  other. 

(3)  Whenever  p  is  large  enough,  there  exists  a  unique  solution  to  the  order  balance 
equations  (15),  (8).  These  unique  solutions  are  all  translates  of  the  solutions  for  the 
modified  balance  equations. 

Proof.  Suppose  that  for  a  fixed  p  and  V,  there  exist  two  distinct  solutions  p  and 
P  to  the  order  balance  equations.  Define 

A:={keX:pk^pk). 

Then  we  claim  that 

max  Pi  ©  V;;  =  max  PjQVji  —  — oo 

it  A,jt  A'  ii  A.jf  A‘ 

and 

max  piQVi,=  max  /3,©  Vt,  =  — oo. 

i*'  A.jt  A'  i*  A.  j <  A' 

We  need  only  consider  the  case  where  A  ^  0  and  A  #  X  (otherwise  the  claim  is  trivially 
true),  and  let  us  suppose  to  the  contrary  that  both  expressions  are  nonnegative.  Then 

max  /3,©V,(=  max  P,QV „>  max  P, ©  V„  =  max  p,QVtj  g  max  p,©Vh, 

i*  AJ‘.  .V  h.A.b'  A'  i-  A.)*  A"  iiAjzA'  i-  AJ<-  A' 

which  is  a  contradiction.  The  other  two  cases  follow  similarly,  and  so  the  claim  is  true. 
This  shows  that  solutions  to  the  order  balance  equations  do  not  differ  arbitrarily; 
specifically,  all  the  arcs  that  separate  A  from  A‘  are  transient. 

Hence  in  particular,  whenever  we  can  show  that 

(19)  p,QVi,S0  for  all  i,j,  with  i  #  j,  and  Vf)  <  +oo, 

there  can  only  exist  one  solution  to  the  order  balance  equations  for  the  given  (p,  V). 

Now  we  show  that  this  is  indeed  the  case  when  p  is  large,  which  will  prove  the 
first  part  of  the  assertion  (3)  above.  Specifically,  suppose  now  that  p  a  2  £t).  v„.  +  ,  V„. 

Let  i„e  X  be  a  state  with  p ki  =  p.  For  arbitrary  s  €  X,  let  ( i*  =  i, ,  •  ■  ■ ,  in  =  s)  be 
a  path  from  /'*  to  s  such  that  V„  ,  1(  <  +oc  for  k  =  1,  •  •  • ,  p  and  ik  *  im  for  k  *  m.  Let 
/( i)  =  arg  min,  V„.  With  A  =  { 4 }  and  applying  the  Order  Balance  Theorem  2,  it  is  easy 
to  see  that 


(20) 


J8,  ,©n 


ip„ev, 


•k .  Ik  > 


;  max  (PikQ  V,k.,). 


To  prove  that  p ,  s  max,.>;  v„  V„  <  +oo,  it  is  sufficient  to  show  that  for  k  =  1,  •  •  • ,  p, 
along  the  path  from  i*  to  s. 


(21) 


Pu  =  P„, -  V, . +  v„.( 


.  +  V, 


i .  u  n  > 


since  P„,  =  p  g  2  S,,:  + 1  V„. 

We  prove  (21)  by  induction.  For  k  =  1,  from  (20)  we  see  that 


(22) 

Clearly,  the  left-hand  side  of  (22)  is  nonnegative,  implying  that  the  right-hand  side  is 
also  nonnegative.  Thus,  we  can  replace  “©”  with  giving 


(23)  P^P<-V.„,,+  V,,.ioo. 

Now  assume  (21)  holds  for  fc-1.  From  (20)  we  have 


(24) 


p,t  ,©v(l  lAs^evw 


»*)• 
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The  left-hand  side  of  (24)  is  nonnegative  and  so 


P^Sf} 

S/3 


—  V  +  V 

Ik  I  Ik. /(<<.) 

_  v  -i-  V  —  V 

ki  r  ki.'l  r‘i.Ki,> 


+  Vi,. /(.;>- 


V'k-,.‘k+Vlk.l<ik>’ 


which  completes  the  induction  proof.  This  proves  (19),  and  therefore  there  exist;  a 
unique  solution  whenever  p  is  large  enough,  which  is  the  first  half  of  assertion  (3)  above. 

Moreover,  for  the  large  enough  p  specified  earlier,  due  to  (19),  we  have  p,QVtl  = 
Pi  -  Vy.  Hence  {£,}  itself  satisfies  the  modified  balance  equations.  In  fact,  this  solution 
is  unique  to  the  modified  balance  equations  since,  if  A  is  any  other  solution,  then  we 
can  prove  in  a  fashion  similar  to  the  above,  that  A ,  g  Vy  for  all  j  £  Nf,  thus  yielding 
that  A ,©  Vy  =  A,  -  Vy,  which  in  turn  proves  that  A  is  yet  another  solution  to  the  order 
balance  equations,  which  is  a  contradiction. 

Hence,  at  least  for  large  enough  values  of  p  we  have  proved  the  existence  of  a 
unique  solution  to  the  modified  balance  equations.  However,  it  is  easy  to  see  that  if 
A  satisfies  the  modified  equations  for  a  given  ( p,  V),  then  A  -  S  satisfies  the  modified 
balance  equations  for  (p-8,  V),  thus  proving  the  existence  of  a  unique  solution  to 
the  modified  balance  equations  for  all  (p,  V).  This  proves  the  assertion  (2)  as  well  as 
the  second  half  of  the  assertion  (3)  above. 

Now  we  turn  to  the  proof  of  assertion  (1)  above.  Let  A  be  arbitrary,  and  let  {A,} 
be  the  solution  of  the  modified  balance  equations,  and  define  p,  :=  Af©0.  Suppose 


max  Aj  -  Vy  <  0. 

iir  A, Je  A' 

Then  by  (16)  we  also  have 


max  -  Vy  <  0. 

it  A.jk.  A' 

However,  then  for  each  i  e  A  and  j  £  A', 

P,  =  A,  <  Vy  and  p,^\,<V„. 

Hence, 

(3,  ©  Vy  =  -00  and  /},  ©  Vit  =  -oc, 

and  so 


max  /3,©Vy=  max  /3,©V„, 

i*:  A.ji  A'  it  A.  A' 

thus  satisfying  the  original  order  balance  equations.  If,  however, 

max  A,  -  Vy  =  5  g  0, 

!■  A.).  A' 


then  by  ( 16) 


max  A,  -  Vjj  =  &  g  0. 

it  A.j't  A' 

Suppose  that  (ii,jt)eAxAr  and  (i2,  j2)e  A'  x  A  are  such  that 

—  V(|  ,y,  »*  Ay,  —  Vh  i 2  =  S. 

Then  since 


Ay  =  V(| ,y  +  «g0  and  A k  =  V„.i;  +  S  g 0 
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we  have 

/3„  =  A„  and  0,.  =  A,,, 

and  so 

Also,  since  A*  s  /3t,  we  have 

max  /?,©  s  max  A,©  V.j 

i,  .4,  / *  4'  i>  AJ.  4' 

§  max  A,  -  vu 

i<:  AJc  .4* 

=  A1|-V„.7, 

=  /3„-V„,l 

=  ^,©V„.7V 

Similarly,  max,,.,,.,,  a-  /3,Q  V„  =/3J:©  and  so 

max  /3,©V,,  =  max  f}j@Vh. 

i  nA.jaA' 

This  proves  the  assertion  ( 1 )  and  the  theorem.  □ 

Remark  1.  It  is  interesting  to  note  that  the  existence  of  a  solution  to  the  modified 
balance  equations  has  been  proved  by  relying  on  the  existence  of  a  solution  to  the 
order  balance  equations,  which  in  turn  is  guaranteed  by  the  probabilistic  arguments 
of  Theorem  2.  A  separate  independent  constructive  proof  of  existence,  which  does  not 
use  probabilistic  arguments,  can  be  found  in  [8]. 

We  now  give  an  algorithm  for  determining  the  unique  solution  to  the  modified 
balance  equations.  An  illustrative  example  is  convenient. 

Example  2.  Let  p  =  5  and 

4  3  1 

★  37 

2*4' 

6  5  ★. 

Our  goal  is  to  determine  A  =  (A, ,  •  •  • ,  A4),  which  satisfies  (16),  (17).  We  shall  refer  to 
A,  -  V„  as  the  A  -flow  along  the  arc  (»,  j).  Consider  first  the  modified  balance  equation 
for  the  edge  cut  A  =  {(}, 

(25)  max  A,  -  V„  =  max  A,  -  V,,. 

i*t 

Observe  that  the  left-hand  side  of  (25)  can  be  written  as 

A,  -  min  V,„ 

/*» 

and  so  the  arc  of  maximum  \-flow  out  of  A  =  {i}  is  the  arc  (i,  /(f))  where 

/( i)  =  arg  min  VtJ. 

i*  i 

(Note  that  Hi)  may  not  be  unique.) 

We  now  construct  the  directed  graph  G,  =  ( V, ,  E,),  with  V,  =  {{1},  •  •  • ,  (4}}  and 
(i,j)e  £,  if  j  =  /  ( / ) .  See  Fig.  1. 


V  =  [VJ 


f* 

6 
6 
l  2 
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Note  that  G,  has  two  directed  cycles  and  {2} -*{3} Let  os 

examine  the  A -flows  on  the  directed  cycle  {!}-» {4}  -*{)}.  Since  A,  -  V,4  is  the  maximum 
A-flow  out  of  {1},  it  is  not  smaller  than  any  A-flow  into  {1},  and  so  in  particular 

A  i  ~  V|4  5  A4  ~  V4, . 

Also,  A4  —  Vix  is  the  maximum  A-flow  out  of  {4}  and  so 

A4 -  V4l S  A|  —  Vu- 

We  thus  observe  that  the  A -flows  along  the  directed  cycle  {1}  -»  {4}  -» {I  f  are  equal ;  that 


IS, 

A  |  V[  4  —  A  4  V4I 

and  so 

(26) 

A  j  —  1  —  A4  2. 

Thus,  we  have  determined  the  difference  between  Ai  and  A4. 

In  exactly  the  same  way,  from  the  directed  cycle  {2}-*j3}-»(2}  we  see  that 

(27)  A,-3  =  A,-2, 


thus  determining  the  difference  between  A;  and  Aj. 

At  the  next  step  of  the  algorithm,  consider  the  modified  balance  equations  for  the 
edge  cut  (A,A‘)  where  A  =  {  1,4}  and  A‘={2,3}.  Observe  that  for  A  =  {1,4},  the 
left-hand  side  of  the  modified  balance  equation 

(28)  max  A.-V^  max  A j-V# 

ic  A.  i*:  A'  if.  A>jc-  A' 

can  be  written  as 


max  ( A i  —  ,  A j  V|j,  A4  V<2»A4  L4, ) ■ 


that  is. 


max  (A(-4,  Ai  -3,  A4--6,  A4-5). 
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We  have  previously  determined  that  A4-  A,  =  1,  and  so  the  maximum  is  achieved  by 
A,  -  V, ,  =  At-3,  and  the  arc  of  maximum  A-flow  out  of  {1,4}  is  the  arc  (1,3). 

In  a  similar  fashion,  examining  the  right-hand  side  of  the  modified  balance 
equation  (28),  we  determine  that  the  maximum  A-flow  out  of  {2,3}  is  achieved  by 
A,-  V,4  =  A3-4,  and  so  the  arc  of  maximum  A-flow  out  of  {2,3}  is  (3,4). 

We  now  consider  the  directed  graph  G2  =  (V2,£2),  with  V2  =  {{1, 4},  {2,  3}}  and 
£,  =  {(1,3),  (3,4)}  shown  in  Fig.  2.  Note  that  E2  is  the  set  of  the  arcs  of  maximum 
A-flow  out  of  the  edge  cuts  in  V2. 


Fig.  2.  The  graph  G,  oj  Example  2. 


Observe  that  G2  has  a  directed  cycle  {1, 4}-»  {2,  3}  -*  {1, 4}.  Now  note  that  A,  -  K,, 
is  the  maximum  A-flow  out  of  { I,  4}  and  A  ,  -  V,4  is  the  maximum  A-flow  out  of  {2,  3} 
and  so 

A  |  —  Vj ,  =  Aj  —  V(4; 

that  is, 

(29)  A, -3  =  A, -4. 

Combining  (26),  127),  and  (29),  we  obtain 

(30)  A,  -  3  =  A2  -  5  =  A,  —  4  =  A4  -  4. 

We  now  know  the  pairwise  differences  between  all  of  the  A,'s,  and  so  we  do  not  need 
to  consider  any  additional  edge  cuts.  To  fix  the  values  of  {A,},  we  use  the  value  of  p 
to  give 

max  A ,  =  p  =  5. 

i*  X 

Since,  from  (30),  A;  is  the  largest,  we  set  A2=  5.  We  thus  obtain  the  solution  to  the 
modified  balance  equations: 

A,  =3,  A,  =  5,  A,  =  A4  =  4. 

The  principal  idea  used  to  solve  the  modified  balance  equations  in  Example  2  is 
summarized  in  the  following  lemma. 

Lemma  2.  ( 1 )  Given  Ac  X  for  which  we  know  the  pairwise  differences  between  all 
the  A's  for  states  in  A,  we  can  determine  the  arc  of  maximum  A-flow  out  of  A  (without 
knowing  the  A,'s  themselves). 

(2)  Let  A|,  A2,  •  •  • ,  Ap  be  a  partition  of  X  and  suppose  for  each  A*  we  know  all 
the  pairwise  differences  between  the  A,'s  for  all  states  in  A*.  Let  ( U ,  jk )  denote  the  arc 
of  maximum  A-flow  out  of  Ak.  Construct  the  directed  graph  G  =  (V,  E),  with  V  = 
(A i,  •  •  ■ ,  AP}  and  £  =  {(», ,jt),  •  •  ■  ,(iP,jr)}.  There  exists  a  directed  cycle  on  G.  If 
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{A„  ,  ■  ■  ■ ,  ,4„  }  is  the  list  of  vertices,  in  order,  along  the  directed  cycle,  then  the  K -flow- 
on  the  directed  cycle  is  constant-  that  is. 


and  we  can  determine  the  pairwise  differences  between  the  values  of  the  A,’s  for  all  the 
states  in  UT-i  A>,- 

Proof.  (1)  Without  loss  of  generality,  suppose  A  is  the  set  of  states  {1,2 r). 
Let  a,  :=  Ai  -  A,.  (We  know  the  a^s.)  Then 

max  As  —  V,,  =  max  A,  —  at  —  =  A,  —  min  (a>+V,,). 

i>  A.ii  A'  icAjcA 4  it  AJe  A' 

Thus,  the  arc 

(i*,j*)-=  arg  min  (a,+  V„) 

it  A.je  A' 

is  an  arc  of  maximum  A-flow  out  of  A. 

(2)  The  out-degree  of  each  vertex  of  G  is  at  least  one,  and  so  from  elementary 
graph  theory  it  follows  that  G  has  a  directed  cycle.  Suppose 

An^Aa,  -*-  ■  ■  -»  A„ui  ->  A„, 

is  such  a  directed  cycle.  Then  we  have  the  situation  shown  in  Fig.  3.  Now  (i,h,  j,H)  is 
the  arc  of  maximum  A-flow  out  of  Ani,  and  so  the  A-flow  on  this  arc  is  not  less  than 
the  A-flow  of  any  arc  into  A„L.  In  particular. 


As- V,vS2A,„ 


for  k  =  1, 


where,  for  convenience,  we  implicitly  identify  t%  with  i„m  and  with  j„  i:.  Thus, 


A,„ 


SA,„  -Vf„ 

=  As..  r  v‘».„ 


Fig.  3.  A  directed  cycle  of  maximum  A -flows  in  Lemma  3. 
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Therefore,  the  A -flow  on  the  directed  cycle  is  a  constant: 


(31) 


A.„  -  Vf„  =  A,  -  V,.  =  •  ■  •  =  A, 

"|  **,  h2  ” J  ,h2  ni 


For  each  A„t  in  the  directed  cycle,  we  know  the  pairwise  differences  between  the  A('s 
for  states  in  A„k.  Using  (31)  we  can  now  easily  determine  the  pairwise  differences 
between  all  the  A,’s  for  states  in  Uf=,  A„k.  □ 

The  algorithm  for  solving  the  modified  balance  equations  is  outlined  below. 


Algorithm  to  Solve  Modified  Balance  Equations. 

Step  1.  Set  A)  =  { i}  for  i  =  1,  •  •  • ,  |X|.  We  call  the  Af's  coalitions  at  step  k.  Note 
that  for  every  i,  the  pairwise  differences  between  the  A-values  for  all  states 
in  Aj  are  (trivially)  known.  Set  A1  :-=  {A,1,  Af,  •  •  • ,  A'^iJ.  Let  JV(1)  =  |A'|=:  the 
number  of  elements  in  the  set  A1  =  number  of  coalitions  at  Step  1. 

Step  k.  Given  Ak  :=  { Af ,  A.( ,  •  •  ■ ,  A‘nih),  where  for  each  A*  e  Ak  the  pairwise 
differences  between  all  of  the  A,'s  for  i’s  in  Af  are  known,  construct  A*+l  as 
follows.  Using  Lemma  2,  identify  the  directed  cycles  in  the  graph.  (There 
exists  at  least  one  directed  cycle.)  The  elements  of  Ak  +  I  consist  of  the  directed 
cycles  identified  in  the  graph,  and  those  Af  e  Ak  that  are  not  in  any  directed 
cycle.  (More  precisely,  if  { A^, ,  Ak ,,  •  •  • ,  AkJ  is  a  directed  cycle,  then  (J  "L ,  A„ 
is  an  element  of  A*+l.)  Note  that  for  every  Af  *’  €  Ak*\  the  pairwise  differences 
between  all  of  the  A,'s  for  i's  in  Af  +  1  are  known.  Furthermore,  if  N(k):~  |  Afc  j, 
then  N(fc  +  1)<  N(k). 

Last  Step.  Stop  when  N(k)  =  1.  Note  that  the  pairwise  differences  between  all 
A.’s  are  known,  and  the  A  satisfying  the  modified  balance  equations  can  be 
obtained  by  a  translation  by  using  the  given  value  of  p.  □ 

4.  An  algorithm  to  obtain  all  solutions  of  the  order  balance  equations.  We  now 

characterize  all  solutions  to  the  order  balance  equations,  and  describe  an  algorithm 
for  generating  all  these  solutions.  To  do  so  we  will  use  the  coalitions  {Af}  generated 
by  the  algorithm  of  the  preceding  section.  Let  us  call  A,  -  Vj(  and  /3„  =  fi,Q  V„  as  the 
k-flow  and  fi -flow,  respectively,  along  the  arc  (/,  j). 

Lemma  3.  (1)  If  ( i,  j )  is  an  arc  of  maximum  k-flow  out  of  Af ,  then  it  is  also  an 
arc  of  maximum  fi-flow  out  of  Af . 

(2)  If  {Ak  ,  Ap\  is  a  directed  cycle  obtained  at  step  k,  then  the  fi-flow  along  the 
directed  cycle  is  a  constant. 

(3)  If  the  fi-flow  along  the  directed  cycle  {Af ,  •  ■  ■ ,  Af}  obtained  at  step  k  is  -oc, 
then  the  fi-flow  along  any  directed  cycle  obtained  at  step  n>  k  containing  A"  =  U f  ,  Ak 
as  a  node ,  is  also  — x. 

(4)  If  the  fi-flow  along  the  directed  cycle  { At ,  •  •  • ,  A*}  obtained  at  step  k  is  gO, 

then  for  every  iJ<=Ak*' U pm  ,  Akm  there  exists  a  path  (/  =  /„,  i,,  -  •  ■ ,  iq  =  j)  such  that 
i„  e  Ak ' '  ana  SO  for  O^m^q-1. 

Proof.  We  will  first  prove  <l)-(3)  by  induction.  Consider  k  =  1.  Since  Af  is  then 
just  a  singleton,  say  Ak  =  {/},  an  arc  (/,  m)  of  maximum  A-flow  out  of  {/}  is  just  one 
for  which  Vlm  =  min„  V,„.  Clearly  this  is  also  an  arc  for  which  fi,Q  Vim  =  min*  fi,Q  V,„. 
Now  suppose  that  {Af ,  •  •  • ,  Af}  is  a  directed  cycle  of  such  maximum  flows.  Then  an 
application  of  the  Order  Balance  Theorem  to  each  Af  shows  that  0,,  =  0:_,  =  •  •  •  =  fipl . 
Suppose  now  that  fin  =  fiiy  =  ■  ■  •  =  fipl  =  -oo.  Then  if  (/,  m)  is  an  arc  of  maximum 
/?-flow  out  of  Uf, i  Af,  clearly  j3/msj8UM  =  -oo.  Thus  the  assertion  is  true  for  k  =  l. 

Now  suppose  that  the  assertion  is  true  for  1,2,  •  •  • ,  fc-1.  Consider  a  coalition 
Af .  If  the  j3-flow  along  some  directed  cycle  {Af ,  •  •  • ,  Af}  at  some  step  n  <  k  with 
Af  =  U i  A"  was  -oo,  then  clearly  the  maximum  /3-flow  out  of  Af  is  -oo,  and  so  any 
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arc  out  of  Ak  is  an  arc  of  maximum  /3-flow.  On  the  other  hand  if  the  /3-flow  along  the 
directed  cycle  (A",  -  •  • ,  Aq]  is  ^o,  then  the  differences  between  the  /3,'s  for  states 
i  e  Ak,  are  the  same  as  the  differences  between  the  A,’s,  i.e., 

(32)  0.  -  0,  =  A,  -  A,  for  all  i,jeAk, 

and  so  the  arc  of  maximum  A-flow  out  from  Ak  is  also  an  arc  of  maximum  /3-flow  out 
from  Ak .  Moreover,  if  {Af,  •  •  • ,  Ak}  is  a  directed  cycle  at  step  k ,  then  an  application 
of  the  Order  Balance  Theorem  to  each  Ak  shows  that  the  /3-flow  along  the  directed 
cycle  is  a  constant.  Finally,  if  this  /3-flow  is  — oo,  suppose  that  (r,  m)  is  a  maximum 
flow  arc  out  of  U/L,  A,*.  Suppose  that  re  Ak.  Then  clearly  max,t4;Jt/4y  j0,y§/3rm  and 
so  /3rm  =  -oo.  This  completes  the  induction  and  the  proof. 

Finally,  to  see  (4),  note  first  that  from  (1),  (2),  and  (3),  the  /3-flow  along  any 
directed  cycles  contained  within  Akki  is  ^0.  Since  A*  +  l  is  formed  as  the  union  of  such 
directed  cycles,  the  result  follows.  □ 

Motivated  by  (3)  and  (4)  above,  we  introduce  the  following  definition. 
Definition  4.  We  shall  say  that  i  is  recurrently  connected  to  j  if  there  exists  a 
path  (i  =  J<„  i,,  •  •  • ,  iq  =j)  with  /3,m  4,„g0  for  0^  m  -  1. 

We  shall  say  that  a  set  A  s  X  is  a  recurrently  connected  set  if  for  every  i,  j  e  A  and 
k  e  A',  i  is  recurrently  connected  to  j  but  not  to  k. 

From  Lemma  3  it  follows  that  recurrently  connected  sets  are  precisely  those  Af"s 
for  which  the  /3-flow  out  of  Ak  is  -oo,  while  the  /3-flows  along  the  directed  cycles 
contained  within  Ak  are  SO.  Note  also  that  the  recurrently  connected  sets  form  a 
partition  of  X. 

We  now  proceed  to  determine  which  sets  are  possible  candidates  for  being 
recurrently  connected  sets.  Consider  a  typical  candidate  Ak  +  I.  Let  3-  denote  the  /3-flow 
on  the  cycle  {Af,  •  •  • ,  A*},  where  A**1  =  UJL,  Af.  Then  if  ( im,jm )  is  the  arc  of 
maximum  flow  out  of  Akm  (and,  by  construction,  into  A, ,,nodp),  we  must  have 

&  ~  V,,  j,  =  /3,,  ~  V'j.j,  =  •  •  •  =  /3V  -  K(„,r  =  0, 

max  /3,  -  Vj(  <  0,  max  /3,  S  p. 

l.  A)" 

We  will  now  attempt  to  determine  whether  there  exist  {f3,:  i e  Ak*'\  that  satisfy  these 
conditions.  Note  that  if  this  is  not  feasible,  then  A?  +  l  cannot  be  a  recurrently  connected 
set. 

Let  (x,  >•)  denote  the  arc  of  maximum  /3-flow  out  of  Af  +  I.  Then  /3,  <  V,,  .  Fix  m 
to  be  an  arbitrarily  chosen  state  from  Aj1*1.  Then  for  every  state  h  e  Ak*'  we  know  the 
value  of  (/3 h  ~  /3,„)  from  Lemma  3  above.  Let  us  define 

(h  :=  Ph  -  Pm- 

Then 

=  Pn,  +  Cl,-Vil.J, 

=  Px-ix  +  (i,~  Vj,.,, 

<  vv  -Ci  +  St,  M„ 

giving  an  upper  bound  on  9. 

We  must  also  satisfy  the  constraint  max /3,  S  p,  and  so  let 

0  ■-  arg  max 

ic  A  *  *' 
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Then  it  is  clear  that  ft,Smaxlcv,M  ft-  Thus, 

pSft. 

=  ft, -ft, +  6 

=  ft,-  K,j,+  K,  +  6. 

=  *+V,  ;-(>+(„, 

and  so 

^SP'  V^, -f8=:  M3, 

giving  yet  another  upper  bound  on  (Note.  If  A;  +  I  =  {i},  then  A4,  =  minj  VJ,  and 

M:  =  p.) 

Any  choice  of  ^  from  the  interval 

fl(  A*+ ’)  :=  [0,  M,)n[0,  M,] 

will  allow  assignments  for  the  recurrence  orders  of  states  in  A)*'  consistent  with  the 
assumption  that  the  coalition  Af  +  ’  is  a  recurrently  connected  set.  If  Q.(Af*')  =  0  then 
then  there  is  no  assignment,  and  so  Af  +  l  is  not  a  recurrently  connected  set. 

We  still  need  to  determine  the  set  of  all  recurrently  connected  sets.  To  do  this  we 
construct  a  rooted  tree  having  the  coalitions  produced  by  the  general  procedure  as 
nodes,  and  having  a  directed  edge  from  coalition  Ap  +  1  to  Akr  if  Aj°  2  A1/1.  Hence, 
the  root  of  the  tree  is  X,  and  its  leaves  are  the  singleton  sets  { I },  {2},  •••,{«}.  Let  D, 
be  the  set  of  the  leaves  of  the  tree  that  are  descendants  of  the  node  i  in  the  rooted  tree. 
We  say  that  a  set  E  of  nodes  is  a  proper  cover  if 

U  DA  =  X 

A-  H 

and 


DA  DDa=0  for  A /A'. 

Now  the  algorithm  to  determine  all  the  solutions  of  (15),  (8)  proceeds  as  follows. 
Let  a  set  E:=  {A|,  A2,  ■  ■  • ,  A*}  be  a  proper  cover.  Now  we  will  determine  whether  E 
can  be  a  set  of  all  recurrently  connected  sets,  as  follows.  First  we  determine  fl(  A,)  for 
every  A,  e  E.  (Note  that  if  we  guess  X  to  be  a  recurrently  connected  set,  then 
fl(X)  =  [0,  M3],  since  the  M,  upper  bound  is  +oo  because  there  is  no  maximal  flow 
out  of  X.  Also,  if  we  guess  the  singleton  {/}  to  be  a  recurrently  connected  set,  then 
(l({i()  =  -oc (J ([0,  M,)0[0,  M3]).  If  any  of  the  fl(Ay)’s  is  empty,  then  the  guess  E  is 
not  a  feasible  set  of  recurrently  connected  sets.  If  every  fl(A;)  is  nonempty,  then  let 
uF(:=supft(Ay).  If  this  “sup"  is  not  attained,  then  we  cannot  assign  p  to  any  state  in 
A,.  If  this  “sup”  is  attained,  then  we  determine  for  each  such  A,  whether,  with  the 
choice  of  there  is  a  state  i,  €  A,  with  ft  =  p.  If  no  such  state  exists  for  any  A,,  then 
again  E  is  not  a  feasible  set  of  recurrently  connected  sets.  Finally,  if  there  exist  such 
A,'s  then  let  ,stf(E)  be  the  set  of  all  such  Af  s.  Now,  the  set  of  all  solutions  corresponding 
to  E  is  obtained  by  picking,  in  turn,  an  A,  from  j#(S),  fixing  its  flow  as  and  choosing 
all  other  arbitrarily  from  the  fl(  A, )'s.  By  checking  every  proper  cover  E,  we  thus 

determine  all  solutions  to  the  order  balance  equations,  as  the  following  theorem  shows. 

Theorem  4.  All  solutions  to  the  order  balance  equations  can  be  generated  by  using 
the  method  described  above. 
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Proof.  Suppose  /3  satisfies  the  order  balance  equations.  Then  for  this  solution 
determine  the  set  E  of  recurrently  connected  sets.  This  set  must  be  a  proper  cover. 
For  this  set  E,  there  must  be  some  A 1  with  corresponding  /3-flow  equal  Jr  Now 
determine  the  /3-flows  on  the  recurrently  connected  sets.  We  generate  this  solution  /3 
when  we  choose  E  as  the  set  of  recurrently  connected  sets,  and  A,  as  the  coalition 
with  maximum  flow  equal  to  #/5  and  assign  the  correct  /3-flows  on  the  other  recurrently 
connected  sets.  □ 

This  algorithm  takes  an  exponential  in  |X|  number  of  steps,  due  to  the  necessity 
of  checking  all  proper  covers.  However,  the  complexity  issue  is  not  the  primary  concern 
here,  since  the  problem  of  asymptotic  analysis  of  the  stochastic  process  is  not  a  priori 
known  to  be  a  problem  resolvable  by  a  finite  algorithm. 

We  illustrate  the  procedure  for  determining  all  solutions  to  the  order  balance 
equations. 

Example  3.  We  construct  all  solutions  to  the  order  balance  equations  for  Example 
2  when  p  =  4.  See  Fig.  4  for  the  rooted  tree.  We  check  the  proper  covers: 

( 1)  E  =  {A"}:  fl(X)  is  empty,  so  X  cannot  be  a  recurrently  connected  set. 

(2)  E  =  {{ 1,  4},  {2,  3}}:  Using  the  method  described  above  we  obtain 

6,  =  a,  /3,  =  4,  0,  =  3,  /34=1  +  o 

where  |ga<3. 

(3)  E  =  {{l,  4},  {2},  {3}}:  max,.  v  /3;  < 4,  a  contradiction. 

(4)  E  =  {{1},{4},{2,3»: 

/3,  =  y,  0;  =  4,  0,-3,  0  *=  9 

where  y  =  -oe  or  0  §  y  <  1 ,  and  0  =  -oc  or  0  s  0  <  2. 

(5)  E  =  {{11,  {2},  {3J,  {411:  max,,  x  0,  <4,  and  so  {{11,  {21,  {31,  {4}1  is  not  a  set  of 
recurrently  connected  sets. 


We  have  checked  all  proper  covers.  Hence  the  set  of  all  solutions  is  {(a,  4,  3,  1  +  a):  is 
a  <  3}U  {( y,  4,  3,  9):  y  =  -<x>  or  OS  y  <  ;  and  9  =  —  oo  or  OS  9  <  2}. 

How  can  nonunique  solutions  to  the  order  balance  equations  arise,  and  what  is 
the  implication  of  such  nonuniqueness?  First  let  us  consider  the  case  where  a  unique 
solution  exists.  Since  such  a  solution  is  uniquely  determined  by  the  algorithm,  it  is 
clear  that  the  recurrence  orders  of  the  states,  and  thus  the  rates  of  convergence  of  the 
transition  probabilities,  depend  only  on  the  Vy’s  in  the  transition  probabilities  = 
Ci/f(Ov‘',  and  not  on  the  proportionality  constants  {c„}.  However,  in  the  case  of 
nonunique  solutions,  the  following  example  shows  that  the  recurrence  orders  may  even 
depend  on  the  proportionality  constants  { c(/ } . 
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Example  4.  Let  X  =  {1, 2,  3}  and  V„  =max  {0,  j-i}.  Let  c,,  =  c,,=  1,  c„  =  1  -  a, 
and  e,,  =  a,  where  ae(0,  l).  Set  c(i  =  0  for  all  other  i,  j.  See  Fig.  5.  Let  the  cooling 
schedule  be  f(/)=  1/l  Then  the  complete  set  of  order  balance  equations  obtained  by 
using  all  edge  cuts  is: 

p2Q  V,,  =  0,0  VH,  P ,0  V3I  -  /J,  ©  V„, 

max  (P2Q  V,,,  0,0  V,.,)  =  max  1/3,0  Vi2,p,Q  V„), 
with  the  maximum  given  by, 

max  p,  =  1. 


The  assignments 

p,  =  1,  /3;  =  y,  /3,=  -<x> 

satisfy  the  order  balance  equations  for  every  y  £  { -oc}  U  [0,  1 ).  Thus  any  value  of  p2  <  1 
gives  a  solution  of  the  order  balance  equations. 

However,  a  calculation  that  can  be  found  in  [8]  shows  that  the  correct  order  of 
recurrence  of  state  2  is 


p2  =  a. 

Thus,  the  order  of  recurrence,  and  the  rate  of  convergence  of  the  probability  Pr  l.vl  / )  =  2) 
to  zero,  depends  on  the  proportionality  constant  c,;  =  a  involved. 

Based  on  the  above  results,  we  obtain  the  following  property  of'1"  orders  of 
recurrence  of  the  states  in  a  recurrently  connected  set. 

Lemma  4.  Consider  a  recurrently  connected  set  A. 

( 1 )  //  p,c  Jt  for  some  i  e  A,  then  p,e  dt  for  a!!  j  <-  A. 

( 2 )  If  for  some  i  z  A,  p,  =  p,  for  some  p,  e  A,  then  for  every  j  c  A,  P,  =  p,  Jar  some 
p,  €  A. 

Proof  The  proof  follows  immediately  from  (32).  □ 


FlCi.  5.  The  Markov  process  of  Example  4, 
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Thus  all  recurrence  orders  in  a  recurrently  connected  set  are  of  the  same  type, 
i.e.,  either  they  are  all  real  numbers  p,,  or  they  are  all  of  the  type  p,  ,  or  they  are  all 
-oo  (see  Definition  1). 

This  gives  us  the  following  lemma,  which  completes  the  proof  of  Theorem  1. 

Lemma  5.  Suppose  the  rate  of  cooling  is  p  =  p  e  ai,  with  p>  0,  i.e.,  the  maximum  is 
achieved  in  Definition  3.  If  there  is  a  state  i  e  X  for  which  fi,  =  p~ ,  then  lim, . ,  Pr  (x(  t)  = 
i)  =  0. 

Proof.  Suppose  A  is  the  recurrently  connected  class  to  which  /  belongs.  Since  all 
arcs  between  recurrently  connected  sets  are  transient,  it  follows  from  the  Borel-Cantelli 
Lemma  that  along  almost  every  sample  path  w  there  can  only  be  a  finite  number  of 
transitions  between  different  recurrently  connected  sets.  Hence  for  almost  every  w, 
{.v(  t,  ia)}  converges  to  some  recurrently  connected  set.  Hence  the  limit  lim,_,  Pr  (x(t)  e 
A)  exists.  Now  we  show  that  this  limit  is  zero.  Suppose  not,  i.e.,  suppose 
lim,_,  ,,  nfit)  =  5  >  0.  Then  it  follows  that  e(t)r  T,,  A  ni(t)  =  +cc.  Hence  for 
some  j  e  A,  /3,  =  p.  But  then  by  Lemma  4,  /3*  e  3l,  which  gives  a  contradiction.  □ 

5.  Weak  reversibility  and  simulated  annealing.  We  now  turn  our  attention  to  the 
special  class  of  Markov  chains  arising  from  the  method  of  optimization  by  simulated 
annealing.  Recall  that  the  Markov  chains  in  this  class  satisfy  ( 1 )- ( 6 )  with  the  special 
choice  of 


V tl  :=  max  {0,  W ;  -  W,}. 

In  [7]  it  was  shown  that  under  the  “symmetric  neighborhood"  assumption,  c„>  0  if 
and  only  if  ch  >  0,  the  orders  of  recurrence  satisfy  the  following  detailed  order  balance: 

f}„  =  /3„  for  every  t,  j  6  X. 

It  is  easy  to  see  that  the  detailed  order  balance  above  is  equivalent  to  the  sum  of  the 
order  of  recurrence  of  a  state  and  its  cost  being  constant  on  recurrently  connected  sets. 

In  this  section  we  will  show  that  this  constancy  property  of  the  sum  of  ihe 
recurrence  order  and  cost  on  recurrently  connected  sets  continues  to  hold  under  the 
much  weaker  assumption  of  "weak  reversibility”  introduced  by  Hajek  in  [1]. 

Definition  5.  A  state  i  is  said  to  be  reachable  from  state  j  if  there  is  a  sequence 
of  states  j  =  t'o,  i],-  •  • ,  ir=  i  such  that  c(l  .,•*,,>()  for  0s  k%p-  1. 

Definition  6.  A  state  i  is  reachable  at  height  H  from  j  if  there  is  a  path  from  j 
to  i  as  in  Definition  5  for  which  Wik  §  H  for  OS  k  £ p. 

Assumption  l(Weak  Reversibility).  For  any  real  number  H  and  any  two  states 
i  and  j,  i  is  reachable  at  height  H  from  j  if  and  only  if  j  is  reachable  at  height  H  from  i. 

In  what  follows  we  assume  weak  reversibility. 

Theorem  5  (The  Potential  Theorem).  Under  Assumption  1 ,  for  every  recurrently 
connected  set  A  there  exists  a  constant  a  (A)  such  that  /3,  +  W,  =  a(  A)  for  every  i  e  A. 

Proof.  We  fix  our  attention  on  a  particular  recurrently  connected  set  A.  Assume 
to  the  contrary  that  A  can  be  partitioned  into  equipotential  sets  C,,  C\,  •  •  • ,  C,  such 
that  fi,+  W,=  a(Ck)  for  every  ie  Ck,  where  the  a(Ck)’s  are  distinct  constants.  We  will 
show  that  there  is  only  one  equipotential  set,  namely,  A. 

For  each  equipotential  set  C;,  determine  an  arc  of  maximum  /3-flow  out  of  the 
set.  From  Lemma  2,  there  exists  a  directed  cycle  of  these  equipotential  sets,  and  the 
/3-flow  along  the  directed  cycle  is  constant.  Moreover,  from  Lemma  3,  since  A  is  a 
recurrently  connected  set,  these  /3-flows  are  all  nonnegative.  Without  loss  of  generality, 
label  the  sets  along  the  directed  cycle  C,,  C2,  •  ■  ■ ,  Cr  such  that  the  constant  a(C,) 
associated  with  the  set  C,  is  smallest.  Let  ( i„j ,)  be  the  arc  of  maximum  /3-flow  out 
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of  the  set  Cs.  By  construction,  i,e  C,  and  y',e  Cll+slmodp  and 

Pi,  j,  =  PhJ,  =  •  • '  =  A„j„  B  0. 

Knowing  that  we  consider  the  two  cases:  (1)  Wh  g  IV,,;  or  (2)  W2i<  W,, . 

If  case  (1)  is  true  then  since  is  reachable  at  height  Wh  from  it,  by  the  weak 
reversibility  assumption  there  exists  a  path  from  j,  back  to  i,  that  does  not  go  through 
any  states  with  costs  larger  than  Wh.  Let  ( k,  I)  be  the  particular  arc  of  that  path  that 
exits  C2.  Note  that 

P„.j,=Ph.h 

*Pu. 

because  /3,.  is  the  arc  of  maximum  /3-flow  out  of  C2.  If  /3*;  SO  then  /3t,  =  fik  +  Wk-  W,. 
If  /3kl  <  0  then  f}k  +  Wk  -  W,  <  0.  In  either  case,  since  /3f|  y,  s  o,  we  have  that 

P,,  .J,  =  Pi,  +  ~  W1,  =  Pk+  Wk~  Wh 

Now  by  the  weak  reversibility  assumption,  WJt  g  IV,,  and  so 

/3,  +  IV(|  ^/3k+Wk-, 

that  <s, 

«(C|)  g  a(C;), 

which  is  a  contradiction. 

If  case  (2)  is  true,  then  there  is  a  path  from  j ,  to  i|  that  does  not  pass  through 
any  states  with  costs  larger  than  i, .  Again,  identify  the  particular  arc  of  that  path  that 
exits  C:  as  (k.l).  Note  that 

P„=P„.„ 

8/3*1- 

Using  similar  arguments  as  in  case  ( 1 ),  since  j 3,,  S  0  we  have  /3,,  ^  fik  +  Wk-  Wh  Now 
by  the  weak  reversibility  assumption  g  W,,  and  so  a(C,)^a(C2),  which  is  again 
a  contradiction. 

Hence  there  is  only  one  equipotential  set,  A.  0 

Since  W,  +  /},  =  a(A)  for  all  ieA,  where  A  is  a  recurrently  connected  set,  we 
obtain  the  following  necessary  and  sufficient  condition  for  simulated  annealing  to  hit 
a  global  minimum  with  probability  one  from  all  states  ieX. 

Let  M  '■=  {i  €  X:  W,  s  W,  for  all  je  X}  be  the  set  of  global  minima.  We  now  have 
the  following  definition  due  to  [1]. 

Definition  7.  Let  d*  be  the  smallest  number  with  the  property  that  for  every 
is  X  there  exists  a  path  (i  =  i„,  ■  •  ■ ,  ir)  with  cu-ilil>0  for  OS  kSp- 1  and  ending  in 
a  minimizer  ;t,  6  M  such  that 

Wu-W,sd*  for  k  =  l,  •••,/>. 

We  shall  call  d*  the  depth  of  the  minimization  problem. 

Theorem  6  (Necessary  and  Sufficient  Condition  to  Hit  Global  Minimum  With 
Probability  One).  Suppose  that  weak  reversibility  holds. 

(1)  If  £*,,  e(t)J'  =  +oo,  then  for  every  initial  condition  x(0)e  X, 

limsup-J-X  Pr(x(/)e  M)=  1, 

and  the  global  minimum  is  hit  with  probability  one. 
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(2)  If  XT=  i  <  +°o,  then  there  exists  an  initial  condition  x(0)e  X  for  which 
Pr (x(t)e  Mc for  all  fS  1)>0. 

Proof.  The  proof  is  the  same  as  in  Theorem  (4.6)  in  [7]  except  that  if  (/  = 
<o, '  ’  • ,  iP  =j)  is  a  path  from  i  to  j  with  cikJk^  >  0  and  Wik  -  W,  5  y  for  1  s  k  s  p,  then 
instead  of  using  the  reversed  path  (j-  ip,  •  •  ■ ,  i0=  i)  given  by  the  assumption  of 
symmetric  neighborhoods,  we  use  the  path  (j  =  l0,  •••,/,  =  i)  with  c,k  Jkt>  0  and 
W,k  -  W" ,  s  y  for  1  s;  k  ^  q,  guaranteed  by  the  weak  reversibility  assumption.  0 

The  same  condition  e(t)d’  =oo  has  been  shown  earlier  by  Hajek  [1]  to  be 
necessary  and  sufficient  for  lim,..*.  Pr  ( x(t )  e  M)  =  1,  i.e.,  for  convergence  in  probability. 
Thus  while  result  (1)  above  is  weaker  than  his,  since  it  involves  Cesaro  as  opposed  to 
regular  convergence,  the  result  (2)  is  a  stronger  sample  path  result. 

The  above  result  has  been  proved  earlier  in  [7]  under  the  stronger  assumption  of 
symmetric  neighborhoods,  cv  >  0 «  q,  >  0.  Moreover,  under  this  assumption  Connors 
and  Kumar  [7]  have  proved  a  detailed  balance  result  that  we  can  obtain  as  a  corollary 
of  Theorem  5,  as  we  show  below. 

Corollary  1  (Detailed  Balance).  Under  the  symmetric  neighborhood  assumption. 
Pa  =  Pi,  for  every  i,  j  6  X. 

Proof.  If  i  and  j  are  not  neighbors,  then  p,j  =  fy,  =  ~oo. 

If  i  and  j  are  neighbors  and  i  e  R  and  je  T,  where  R  is  the  set  of  recurrent  states 
and  T  is  the  set  of  transient  states,  then 

Pjk  =  -oo  for  all  k 

and  so 

-oo  =  max  pjk  =  max  pki  g  j3„, 

k  *j  k*j 

showing  that  =  /?,,  =  -oo.  A  similar  argument  holds  if  ieT  and  j e  R. 

Finally,  if  i  and  j  are  neighbors  and  i,  j  €  R,  without  loss  of  generality  let  us 
assume  that  a  Wr  Then  /3y  =  p,  SO,  and  so  i  and  j  belong  to  a  common  recurrently 
connected  set.  Hence  by  Theorem  5,  A  +  W, ,  =  Pj •+  Wj.  Since  /3„  =  P,  and  pjt  = 
Pi  +  Wj  -  Wt,  it  follows  that  p,j  =  pM.  0 

Note  that  by  the  above  results,  if  the  order  of  recurrence  of  even  one  state  in  a 
connected  component  is  known,  then  the  orders  of  recurrence  for  all  the  states  belonging 
to  the  connected  component  are  determined.  However,  as  Example  4  shows,  it  is  not 
always  possible  to  determine  the  order  of  recurrence  of  even  one  state  in  a  connected 
component  from  the  order  balance  equations  alone.  In  that  example,  the  connected 
components  of  recurrent  states  are  the  sets  {1}  and  {2},  and  the  detailed  balance 
equations  do  not  determine  the  order  of  recurrence  p2  of  the  single  state  in  the  connected 
component  {2}.  The  reason  for  this  inadequacy,  as  mentioned  earlier  in  Example  4,  is 
that  the  orders  of  recurrence  do  depend  on  the  proportionality  constants  ct/  involved 
in  the  transition  probabilities.  In  any  case,  the  )3-ffows  do  satisfy  Corollary  1. 

6.  Conclusions.  The  notion  of  order  of  recurrence  provides  a  novel  approach  for 
analyzing  the  class  of  Markov  chains  whose  transition  probabilities  are  proportional 
to  powers  of  a  time-varying  parameter  e(t).  These  recurrence  orders  satisfy  a  set  of 
balance  equations,  and  the  Markov  chain  converges  in  a  Cesaro  sense  to  the  set  of 
states  with  the  largest  recurrence  orders.  We  have  given  an  algorithm  for  generating 
a  solution  to  the  order  balance  equations  and  have  also  provided  a  method  for 
characterizing  all  solutions  to  these  equations.  The  algebraic  methods  presented  in  this 
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paper  for  solving  the  order  balance  equations  are  not  always  sufficient  for  determining 
the  recurrence  orders.  In  some  situations  where  nonunique  solutions  exist,  the  orders 
of  recurrence  can  depend  on  the  proportionality  constants  involved  in  the  transition 
probabilities,  and  not  just  on  their  orders  of  magnitude.  This  problem  remains  an  open 
issue.  The  method  of  optimization  by  simulated  annealing  falls  within  the  framework 
of  this  class  of  Markov  chains.  We  have  shown  that  if  the  Markov  process  is  weakly 
reversible,  then  the  sum  of  the  recurrence  order  and  the  cost  are  constants  on  each 
sets  of  states  connected  by  recurrent  arcs.  This  allows  us  to  determine  the  necessary 
and  sufficient  conditions  on  the  cooling  rate  for  the  optimization  algorithm  to  hit  a 
global  minimum  with  probability  one  from  all  initial  states. 
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Abstract 


We  consider  generalised  simaJated-aaaealiag  type  Markov  churns 
where  the  traasitioa  probabilities  art  proportional  to  powers  of  a 
vanishing  small  parameter.  One  can  associate  with  each  state  an 
“order  of  recurrence*  which  quantifies  the  asymptotic  behavior  of 
the  state  occupation  probability.  These  orders  of  recurrence  satisfy 
a  fundamental  balance  eqaaiian  across  every  edge- cut  in  the  graph 
of  the  Maikov  chain.  Moreover,  the  Markov  chaia  coevergea  in  a 
Ceearo- sense  to  the  set  of  states  having  the  largest  recurrence  orders. 
Ws  provide  graph  theoretic  algorithms  to  determine  the  solutions  of 
the  belaace  equations.  By  applying  them  results  to  the  problem  of 
optimisation  by  simulated  annealing,  we  show  that  the  asm  of  the 
recurrence  order  and  the  cost  is  a  constant  for  all  states  in  a  certain 
connectnd  sat,  whenever  n  ‘waak-revursibiUty*  condition  is  satisfied. 


1  Introduction 

We  consider  finite  state  Markov  chains  {*(<)}  with  transition  prob¬ 
abilities  of  the  type, 

*i(<)  - 

when  r(t)  is  a  small  parameter  con  verging  to  teto.  la  a  previous 
paper  [1]  we  have  shown  that  if  one  defines  “orders  of  recurrsncs*  by 
(more  precise  definition!  are  given  In  Section  2), 

A  «p{c  >  0:  £ «W*.(0  *  +<»}, 

M 

than 

(1)  thaaa  recurrence  orders  satisfy  a  balance  equation, 


(or  every  rabeet  A ,  and 

(Is)  the  Marhav  process  converges  to  the  eet  of  states  with  the  largest 
ordera  of  recurrence. 


This  provides  a  novel  approach  to  analysing  the  asymptotic  be¬ 
havior  of  such  time- Inhomogeneous  Markov  procmats.  Specifically, 
one  asm  (l)  to  solve  the  belaace  eq nations,  and  than  (ii)  provides 
Iks  limitiag  behavior.  Moreover  the  orders  of  recurrence  also  provide 
iafonnatkm  about  the  rates  of  coaverganca  of  the  state  occupation 
probabilities.  This  approach  via  recsnenca  orders  therefore  converts 
the  analytic  pro  Warn  of  datenainiag  the  asymptotic  behavior  of  tbs 


“This  ssasaseb  has  base  seppeslefi  la  past  bp  AFOSH  Coatract  He.  AfOSR- 
•Mill,  USA fiO  Ceeiiact  He.  DAAL-ewa-Ka#44,  sad  JSEP  CeaWtct  He. 
Ntttis-et-C-sist. 

'New  wish  fesennsieeel  SeWesst  Hechiaw  Cespeselleu,  Thewse  I.  Wei see 
Kninrh  Ceesst,  See  lit,  Yestsewa  gsljhae,  ICY  ISMS. 


tima-iahomogaaaoM  procam  into  a  partly  algebraic  problem  of  solv¬ 
ing  tha  balance  equations. 

A  significant  motivation  (or  studying  such  Markov  chains  lies  in 
tha  fact  that  in  tha  method  of  optimisation  by  simalatad  annealing,  if 
{Wj}  la  tha  coat  function  whose  minimum  la  sought,  then  one  obtains 
a  Markov  chain  with, 

»i(t)  •d'ltr*-"'-"''. 

Thus  simulated  annealing  is  a  special  case  where  the  powers  V,t  sat¬ 
isfy, 

Vit  im  max(0,  Wj  -  W,), 

for  some  {Wi}. 

Im  order  to  pursue  the  ubove  approach  for  analysing  such  tima- 
iahomoganaoua  Markov  chains,  it  is  necessary  to  be  able  to  solve  the 
balance  equations.  However,  then  can  be  non-unique  solutions  to 
the  balance  equations.  We  present  graph- theoretic  circulation  based 
algorithms  to  obtain  a  solution,  at  well  aa  si/ solutions,  to  tbt  balance 
equations.  We  show  by  aa  example  the  interesting  phenomenon  that 
such  non-aniqaeaem  can  arise  when  the  asymptotic  properties  of  the 
Markov  process,  and  the  recurrence  ordera,  depend  not  just  on  the 
exponents  Vy,  but  also  on  tha  proporiionahlf  constants  ey. 

By  applying  them  results  to  tha  Markov  chain  arising  from  tha 
method  of  op  ti  mi  cation  by  simulated  annealing  when  the  “weak  re¬ 
versibility*  condition  of  Hajek  [2]  holds,  we  show  that  tha  sum  of 
tha  recurrence  order  and  tha  coat  la  a  constant  on  sets  connected  by 
recurrent  arcs.  This  allows  ua  to  obtain  tha  necessary  and  sufficient 
condition  for  the  optimisation  algorithm  to  hit  the  global  minimum 
with  probability  oaa.  Oar  aecaasity  result  is  a  stronger  sample  path 
result  than  is  found  in  [2]  or  (3). 

Background 


TsiUiklis  [3)  baa  alto  investigated  Markov  chains  with  transition 
probabilities  proportional  to  powers  of  a  email  time-varying  parame¬ 
ter.  His  analysis  was  baaed  on  observing  that  doe  to  the  slow  varia¬ 
tion  of  {((!)},  one  can  employ  bounds  on  the  state  occupation  prob¬ 
abilities  for  stationary  Markov  chains,  where  <(i)  is  bald  constant, 
to  obtain  bounds  for  tha  time-iahomogtaaoua  case.  Hit  approach  it 
quite  different  from  ours. 

Baaed  on  aa  analogy  to  tha  physical  process  or  annealing,  the 
sequence  <(f)  is  called  the  'cooling  schedule*  and  just  at  in  the 
physical  analogy  it  plays  a  key  role  la  determining  asymptotic  be¬ 
havior.  U  was  shown  by  Gem  an  and  Gem  an  [4|,  Mitra,  Romeo  and 
Snagfovnaoi-VUcenteili  (5|,  and  Gidas  {6|,  that  simulated  annealing 
converges  in  probability  to  a  minimum  of  the  optimisation  problem 
provided  CSe‘(<)*  ■  +«  for  largi  enough  p.  Hajek  (2)  baa  deter¬ 
mined  the  a  scats  ary  and  suffideat  conditions  on  the  value  of  p  for  the 
algorithm  to  converge  in  protnUHtp  to  the  miaimam  whan  a  “weak 
ravenibiUty*  assumption  it  satisfied. 
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*2  Orders  of  Recurrence  and  Balance  Equa¬ 
tions 

I  'oaudur  a  Markov  chain  over  a  fail*  Mate  apace  X  who**  tranaitioa 
'‘probabilities  an  proportional  to  power*  of  a  vani*hing  time  varying 
parameter  <(t);  that  it,  the  tranaitioa  probabilities- p^t)  :*  Pr (*(«  + 
)  =  *  0  an  give*  by, 

*j(0  *  «ii«(0K'.  foraiii,y€  X,i  j,  tad  1 6  2+,  (1) 

when 

0  <  Vij  <  +oo,  for  all  i,j  6  X,  i  fi  j,  (2) 

4j>0,  for  aU  i,je  X,i  fij,  aad  »  i  for  all  i.  (3) 

i 

.egariiinf  the  (mail  parameter  {<(<)},  we  will  laaame  that, 

0  <  «(t)  <1,  for  ail  I  £  2*  (4) 

344  <  oo  each  that  <(t)  <,  M<(a)  whenever  (  >  a,  (5) 
ad  m 

£<(lF<oo,  for  aomep€(l,+oo).  (6) 

eel 

i  what  follow  we  wiU  aaww  that  la  (1-3)  we  have 

enea  Vy  at  +00 

-hich  ia  dearly  withoat  aay  foe*  of  generality. 

Let  t, (!)  :*  Pr(*(0  *  i)  he  the  probability  dietribatioa  of  «(<), 
-ad  »,,(!)  s  Pr (x(t)  a  •',.«(<  +  1)  *  J)  he  the  probability  of  a 
tiaiuituM  from  atate  i  to  j  at  time  t. 

We  dcfia*  the  iwarnu  order*  for  the  atate*  aad  tranaitioa*  of 
le  Markov  ptocaea,  ae  foUowa. 

Deflnitioa  3.1  The  order  of  recurrence  of  <  etele  i  6  IT,  denoted 

A  .  ie 

-00  ifESoVi(t)  <  +O0, 

a  ,  P"  if  P  ■  *«P{«  2  o :  ESo<(‘)**.(t)  ■  +oo) 

”  ■  end  E2e«(l)P*j(l)  <  +00. 

P  if* «maa(<2>:ESe «(»)•'•(»)  -+<*»)• 

./*  aay  a  atate  i  ie  IraneeeiM  if  A  *  — oo;  otberwie*  we  aay  the  elate 
i*  latawMt 

la  a  limilar  aaaaar  w*  define  the  order  of  recurrence  o/ the  Iran- 
(ten  /rem  i  i*  j,  A*,  by  nptadac  ».(0  with  *,,(/)  ia  the  defiai- 
-je  above.  Again,  wa  aay  the  traaeitioa  from  i  to  j  ie  trantienl  if 
A,  *  -oo;  otherwim  w*  aay  the  traaeitioa  ia  recurrent 
It  ie  deo  convenient  to  dedae  p,  the  order  ef  oaotanf  of  {<(!)},  ae 
Horn. 


Dedeitioa  IJ  The  order  e/the  cauhny  artedele  (*(!)},  denoted  p, 
«  defined  ae, 

(-«  if  ££•<(*)  <  +<“. 

p-  if  P  *  *ap{e  2  • ;  EJS»  t(f)*  •  +oo) 

£2*  «(*)'<+<». 

p  ifp>max{c  2  0:  ££,<(()*  *i  +«)• 

The  relalioeehip  between  A.  Ai  A  (*«*■  *■  ‘he  foUowiag 
Lemma  3.1.  It  will  he  meeenienl  ia  the  sequel  to  deiae  the  operatise 
”*•  a*  follow*: 


k  J  -oo  if «  <  * 

l  e-h  if#**. 


l|  una  2.1  A *  and  A  •»  nfolod  *f 

Ai-A6  K«,  fereUi.jtX, 
w^  r  pend  A  on  nfoted  % 


Proof:  See  (1).  ■ 

KaowUdg*  of  the  A ‘a  provide*  useful  information  about  the 
aeymptotic  propertie*  of  {>(<)}.  The  following  thnorem  thowa  that 
the  lime-inbomogvaaou*  Markov  chain  coeverga*  in  n  Cetera  aerue 
to  the  set  of  ttatea  having  the  iargtat  ordara  of  mcnmace. 

Theorem  3.1  let  Ad  he  the  act  o/  ttatea  with  the  la  rye  *4  order*  o/ 
recur  ranee, 

A4  :»  {.  6  X  :  A  -  p). 

Then 


ttmtup  2  Pr<*(0  6  X)  ■  I- 
tr— «•  " 


Proot  See  [7],  ■ 

Oar  goal  therefore  it  to  determiae  the  rtcarraact  order*,  aad  criti¬ 
cal  to  that  will  ha  tha  foUowiag  nsatt  attabliehed  ia  (1),  which  thowa 
that  then  ie  a  fundamental  balance  of  recurrence  order*  acroe*  every 
edge-cut  ia  the  graph  of  the  Markov  ckaia. 

Theorem  3.2  (Order  BeUacn) 

JXL'*'  “  taSSJt.^*  /ort“r»AS  *■  M 

Equivalent/]/,  winy  the  “9*  notation  end  (7), 

max  A©  KJ  *  max  A8**>  for  every  A  £X.  (11) 

•<4d<  *•  '**o**‘ 1 

Proot  See  (1).  I 

NoU  that  through  Theorem*  2.1  and  2.2  we  have  coavnrtad 

the  problem  cf  datarmiaiag  the  aeymptotic  propertie*  of  the  Uma- 
iahomogeaeoae  Markov  chain  into  aa  afochnaic  problem  of  wiving 
the  bnlaaca  aquation*  (10). 

3  The  Modified  Balance  Equations 

Nou  that  If  (A,0i,...,Ax|)  a*U*<y  (U).  ‘hen  (A  -  «»A  “ 

a . A*l  -  •)  ‘d*°  MOiafy  (11)  far  every  e,  ia.  the  aolutioa  net 

ia  iransfolaoe  invariant.  Thai  the  equation  (8)  which  &xm  tha  max¬ 
imum  of  the  A’*  need*  to  aleo  be  takaa  into  account. 

However,  (11, 8)  caa  together  Mill  poeeeea  non-unique  aolutioaa  for 
sufficiently  email  valum  of  p.  Ia  this  section,  wa  wiU  show  how  on* 
caa  obtain  on*  eolutfoa  to  (11,  8);  ia  tha  aaxt  section  we  show  how 
to  ohtaia  aU  eolation*. 

It  ie  convenient  to  coneider  tha  following  “modified"  balance  aqua- 
tioon,  which  ae  wt  show  ia  tha  sequel,  always  poeeeea  a  unique  eo¬ 
lation.  Given  p  >  0  and  Kj  >  0  for  i,j  »  1,. . . , |JC|  with  i  P  j, 
coaiidcr  the  problem  of  determining  A  :■  (Ai,...,A|jc|)  each  that 

m+x  A,  -Vi,  *  max  A,  -  Vu,  for  every  A  £ 

*aJZa‘  w  icajta*  '  n  x  1  ‘ 


Wt  call  (13, 13)  the  "modified”  balance  equation*.  Observe  that  (12) 
differs  bom  (11)  ia  that  the  operation  is  used  in  place  of  “©." 
Aleo,  the  A'*  caa  he  negative  ia  (12). 

Wt  have  i  a  trod*  cad  tha  modified  balance  equation*  ie  order  to 
avoid  the  diflcaltie*  la  handling  -oo  that  occur  under  the  *©*  op¬ 
eration.  The  foUoariag  theorem  give*  propertie*  of  eolutfoa*  to  the 
order  baiaac*  aad  modified  balaaca  equation* 

Thooram  3.1  1.  If  A  uiuftee  Ike  modified  beience  equations  for 

a  pauen  p  and  V,  Men  9  defined  ** 

A  ;»  A|60  (U) 

Mtifi/b*  M<  order  balance  (fuaffoM  (It,  I)  for  the  given  p  and 

V- 
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t  For  every  given  p  and  V,  then  exists  a  unique  solution  X  to  the 
modified  balance  equation*.  Moreover,  the  solution*  far  different 

mJui  of  p  an  translates  of  each  other. 

J.  Whenever  p  it  large  enough,  then  exists  a  unique  rotation  to  the 
order  balance  equations  (11,  t).  These  unique  solutions  an  all 
translates  of  the  solutions  for  the  modified  balance  equations. 

Pivot  Sac  [7j.  I 

Wa  now  giv*  as  algorithm  for  determining  the  unique  tolntion  to 
tht  modified  balance  equations.  An  illustrative  example  is  conve¬ 
nient. 

Example  3.1  Let  p  a  5  and 


We  have  previously  determined  that  As  -  At  »  I,  and  so  the  max- 
iota  ns  Is  achieved  by  Aj  —  Vu  ■  A|  -  3,  and  the  arc  at  maximum 
A-fiow  out  of  {1,4}  Is  the  vc  (1,3). 

In  a  similar  fashion,  examining  the  RHS  of  the  modified  balance 
Equation  (18),  we  determine  that  the  maximum  A-fiow  out  of  {2,3} 
is  achieved  by  Aj  -  Vis  ■  Aj  -  4,  and  so  the  arc  of  maximum  A-fiow 
ont  of  {2,3}  is  the  arc  (3,4). 

We  now  consider  the  directed  graph  Gj  a  (Vj,£j),  with  Vj  = 
{{1,4),  {2,3}}  and  £j  s  {(1,3), (3, 4)}.  Note  that  £,  is  the  set  of 
t)M  arcs  of  maxi  mam  A-flow  oat  of  the  edge  cats  in  Vy. 

Observe  that  Gy  has  a  directed  cycle  {1,4}  — •  {2,3}  — »  {1,4}. 
Now  note  that  As  -  Vij  is  the  maidmum  A-low  ont  of  {1,4}  and 
Aa  -  Vj,  is  the  maximum  A-fiow  ont  of  {2,3}  and  so 


4  3  1' 

*37 
2*4' 

<  5 

Oar  goal  is  to  determine  A  ■  (A,,. ...As)  which  satisfies  (12,  13). 
We  shall  refer  to  A,  -  Vi,  ss  the  X-flow  along  the  vc  (i,j).  Consider 
fret  the  modified  balance  equation  lot  the  edge  citi*  {•}, 

■«th-  Va  mmexhj-Vjt.  (13) 

iV*  Jpt 

Obeerve  that  the  LBS  of  (13)  can  be  written  as 


y  -  ik>i  « 


/* 

« 

V* 


At-ddyKi, 

and  eo  the  arc  of  mncMsam  X-fiore  ont  of  A  »  (i)  ie  the  vc  (i,i(i)) 

whan 

f(i)  *  sritanVij. 

IV* 

(Note  that  l(i)  may  not  bo  unique.) 

We  now  construct  the  directed  graph  Gx  a  (V,,£,),  with  V,  m 

{{1} . {<}}  «nd  (i,j)  €  Ei  if  j  •  Hi).  Note  that  G\  has  two 

directed  cycles  {1}  —  (4)  —  {1}  end  {2}  — *  {3}  —  {2}.  Let 
us  examine  the  A-fiows  on  the  directed  cycle  (1)  —  {4}  —  (1). 
Since  As  -  V,«  ie  the  maximum  A-fiow  ont  of  (1).  it  is  not  smaller 
than  any  A-low  into  {1},  and  eo  in  particular 


A,  -  VU  *  A«  -  Vs,. 

Also,  As  -  K«  in  the  etrttre  A-fiow  ont  of  {4}  and  eo 


As  -  Vet  2  A,  -Vi«. 

We  this  observe  that  the  A-fiowi  along  the  directed  cycle  {1}  — » 
{'.}  — *  {1}  an  eynaf;  Ust  is, 

A,-Vm.As-Vs.. 


and  so 


At 


-  1  -  As -2. 


(1«) 


Thus,  we  have  determined  the  difference  between  At  and  As- 
In  exactly  the  same  way,  from  the  directed  cycle  {2}  — «  {3}  — 

{2}  wn  tee  that 

A,  -  J  -  A,  -  2.  (17) 

then  determining  the  diffract  between  Ay  and  Ay. 

At  the  not  step  of  the  algorithm,  consider  the  modified  balance 
equations  foe  tbe  edge  cat  (A,  A*)  when  A  ■  {1,4}  and  A*  ■  {2,3}. 
Obeerve  that  for  A  a  (1,4),  tbe  LHS  of  the  modified  balance  eqta- 

imqiA'Xi-Vii..mqxA'Xi-Vii  (18) 

cea  be  written  ae 


that  ie. 


■••(At  ”  A'n.Ar  -  Vis,  As  -  Km,  As  -  Ve); 
no*(X\  -  4,A|  -3,  As  -4,  As  -3). 


At  -  Vis  ■  Ay  -  Vj«,‘ 


that  Is, 


Av  -  3  a  Ay  -  4. 


Combining  (16, 17, 19)  gives 


A|— 3*Ay— 3*Ay— 4aAy  —  4. 


(19) 

(20) 


We  tow  know  the  pairwise  differences  between  off  of  the  Vs,  and 
so  are  do  not  need  to  consider  any  additional  edge  cute.  To  fix  tbe 
values  of  {At},  wa  on  the  value  of  p  to  give. 


m^cAtapaS 

Since,  from  (20),  Ay  is  tbe  largest,  we  set  Ay  a  5.  We  thee  obtain 
the  solatioa  to  the  modified  balance  equations  as, 

A|  a  3,  Ay  a  5,  Ay  a  As  *  4. 


The  principal  Idea  need  to  solve  tbe  modified  balance  equations  ie 

Example  3.1  is  summarised  in  the  following  lemma. 

Iwiws  3.1 

l.  Cits n  A  Z  X  for  which  we  brew  the  pairwise  differences  between 
ail  the  X,'s  for  steies  in  A,  we  cast  determine  the  arc  of  maximum 
X-flow  out  of  A  (without  knowing  the  X,’s  themselves). 

t,  lot  At,  Ay,..., A,  he  a  partition  of  X  end  suppose  for  cock  Ay 
we  know  all  the  pairwise  differences  between  the  X^’s  for  all  states 
in  Ay.  Let  (iy,yy)  denote  the  an  of  maximum  X-flow  out  of  Ay. 
Construct  the  directed  graph  G  a  (V,E),  with  V  a  {A|,...,A») 
and  E  a  {(i|,yi)i...,(iF,Je)}.  There  exists  a  directed  cycle  on 
G.  If  {A*,,...,  A*.}  is  the  list  of  vertices,  in  order,  along  the 
directed  cycle,  then  the  X-flow  on  the  directed  cycle  is  constant, 
that  is, 

-  Vim+n  ■•••**<—- J—- 
and  we  can  determine  the  pairwise  differences  between  the  values 
of  the  Xf ’s  for  all  the  states  in  (Jj^,  Ay. 

Proof!  See  (7).  ■ 

Tbe  algorithm  for  solving  tbe  modified  balance  equations  is  out¬ 
lined  below. 

Algorithm  to  solve  modified  balance  equations 

Step  It  Set  A}  a  {1}  for  r  a  l,...,|Af|.  We  call  the  Af’a  coalitions 
at  step  k  Note  that  for  every  i,  the  pairwise  differences  between 
tbe  A- values  for  ail  states  in  A)  are  (trivially)  known.  Set  A1  :a 
(A|,  A},...,  Ajj,,).  Let  1V(1)  a  |A‘|  a:  the  nember  of  elements 
in  the  set  A1  a  lumber  of  coalitions  at  Step  1. 

Step  kt  Given  A*  :a  {Af, Aj,..., Aj^j,},  where  for  each  A*  €  A* 
the  pairwise  differences  between  all  of  tko  Vr  for  i's  in  A*  are 
known,  construct  A**1  u  follows.  Using  Lemma  3.1,  identify 
the  directed  cycles  in  the  graph.  (Then  exists  at  least  one  di¬ 
rected  cycle.)  The  eltmmU  of  A**1  consist  of  the  directed  cycles 


1498 


identified  ia  the  graph,  aad  thou  Af  6  d*  which  an  not  ia  my 
directed  cycle.  (Mora  precisely,  it  (dj,.^, . d*_)  it  »  di¬ 

rected  cycle,  the*  U“ ,  i^iiu  detent  at  d*+l).  Note  that 
for  every  d**1  €  Ak*1,  the  pairwise  dii fereacee  betweea  ail  ot 
the  V*  (or  i*e  ia  d*+l  an  knows.  Furthermore,  it  JV(4) :»  |d*|, 
thea  .V(t  +  1)  <  N(k). 

let  Step:  Stop  whea  N(k)  ■  1.  Note  that  the  pairwise  diffemces 
betweea  ail  Vs  an  kaowa,  aad  the  X  satisfying  the  modified 
balaaca  equation*  caa  be  obtaiaed  by  a  tnaslatioa  by  using  the 
givua  value  ot  p.  ■ 


An  Algorithm  to  Obtain  All  Solutions  of 
the  Order  Balance  Equations 

•  now  characterize  all  solatioas  to  the  order  balaaca  equations, 

d  describe  aa  algorithm  for  generating  all  these  solatioas.  To  do 
we  will  sse  the  coalitions  (rif }  generated  by  the  algorithm  ot  the 

reeding  section.  Let  as  call  Jt,  -  Kj  aad  ft,  ■  ft0Vi>  *•  the  X-flow 

d  0-flow,  respectively,  along  the  arc  (i,y). 

unma  4.1  /.  lf(i,j)  is  (he  are  a/  maximum  X-flow  eel  of  Af, 

then  it  is  else  tka  an  af  merimem  0-flow  eat  Af. 

I  V  (Af,...,Af)  ia  a  directed  cycle  oklamed  at  step  k,  then  the 
0-flaw  aloof  tka  directed  cycle  is  e  ewe  stent 

1.  If  tka  0-flow  aloof  tka  directed  cy rib  {d|,...,dj)  attained  at 
step  h  is  -oo,  then  tka  0-flow  aloof  any  directed  cycle  obtained 
at  step  a  >  fc  contain  my  d*  w  (£,  df  ee  a  node,  ie  also  -oo. 

f.  (t  t*«  0-0aw  alony  the  directed  cycle  {df,...,dj}  ehteiaed  at 
step  h  is  >  0,  then  /or  every  i,  j  £  Af*1  :■  Lt.,  dA  then 
exisle  a  path  (i  a  ie,i|,...,it  a  j)  rack  that  i_  £  df*1  end 
>0/er»Sm5 *-». 

roots  Set  (T|.  ■ 

Motivated  by  3)  aad  4)  above,  we  ieUodeoe  the  following  delas- 


eflnitioa  4.1  We  tkall  say  that  i  is  ncnneatly  coaaacted  to  j 
there  easts  e  path  (i  »  ie,i|,..-,i(  ■  j)  with  0imjmt,  Z  0  for 
C  m  <  y  -  1. 

M's  shall  say  that  e  set  d  £  Jf  is  e  ncomatly  coaaacted  set  i/ 
r  every  i,  j  £  d  end  h  €  d*,  i  is  rat.  errantly  connected  to  j  kat  not 
k. 


From  Lemma  4.1  it  foUoora  that  recaneatly  mnaactad  sets  an 
•cisely  those  Af’e  foe  which  the  0-low  oat  od  Af  ie  -oo,  while  the 
lows  slaeg  the  directed  cycles  coetsieed  witkia  Af  an  2;  0.  Note 
o  that  the  ncenettly  coa sorted  sets  farm  a  fortitron  od  X. 

We  sow  proceed  to  determiaa  which  sets  are  possible  cat  did  ales 
beiag  recaneatly  cosaected  seta.  Consider  a  typical  caa di date 

*'■  Let  T  dsaote  the  0-low  oa  the  cycle  {d} . d*),  when 

*  UT«,  d*.  Thea  if  (im.jm)  ie  tha  arc  od  meatman  lew  oat  od 
,  (aad,  by  coastractisa,  iato  i*fw4lleraey)i  m  mast  have, 

r  -  A,  -  KtJI  -  A.  -  K,*  - ...  -  ft,  -  Vi,*  2  0, 


max  ft  -  Vij  <  0, 


HAT'otAf" 


AS* 

<44f*' 

riilaowattempt  to  detanaiae  whether  then  exist  {0i  :■  €  d‘+<J 
i  Miisfy  them  condition*  Note  that  if  this  is  aot  fnsibie,  thea 
1  cuaeok  ha  a  racaRiatly  coaaacted  let. 
tt  (*,y)  deaote  the  arc  od  maximam  0-tow  oat  ad  d,  .  Thea 
;  K„ .  Fix  m  to  he  aa  arbitrarily  choeea  state  bom  df  .  Tana 


for  every  state  h  £  A***  we  know  the  value  of  (0* -ft,)  from  Lemma 
4.1  above.  Let  us  define 


C*  :*  0k  -  0m- 

Thea 

■  01|  “  K,J| 

*  0»  +  Cit  ~  K|ji 

*  0v  “  Cs  +  Cl|  “  I'll 

<  ^*V  “  Cs  +  Ci  *  Kisi  *»  Mu 
giving  aa  upper  bouad  oa  7. 

We  must  also  satisfy  the  coast  raiat  max  ft  <  p ,  aad  to  let 

*/»*♦" 

t  out  max  C- 

•ex?*1 

Thea  it  it  clsar  that  0»  i  mnxft.  That, 

A  1  0a 

*  0m  *(a 

*  ft, -C, +C» 

*  A»  ~  Kij,  +  K,  j,  ~  Cii  +  Cs 

*  ^  b  K,j,  “Ci, 

sad  to 

0rS/-K,jl+C,-<r»:Jdz. 

giriag  yet  saother  upper  boaad  aa  f.  (Note:  if  Af**  *  {»},  thea 
Hi  ■  miaj  aad  ids  ■  p.) 
day  choice  od  f  from  the  interval 

O(d^,):.10,idl)n[0.dd,l 

will  allow  assigumeats  for  the  ncamace  orders  od  suits  ia  Af*1 
consistent  with  the  assumption  that  the  coalitioa  Af*1  is  a  recur¬ 
rently  connected  set.  If  0(df*'>)  m  4  thee  then  is  oo  assignment, 
aad  to  Af**  is  aot  a  ncematfy  coaaacted  set. 

Ws  still  aeed  to  determine  the  set  of  all  ncumatly  canasctsd  seta. 
To  do  this  we  coos  tract  a  rooted  tree  having  the  coalitions  produced 
by  the  gsaerul  procedure  as  nodes,  aad  having  a  directed  edge  from 
coalition  d*+(  to  d*  if  dJ+‘  2  di4.  Heace,  the  root  of  the  tree  it 
X,  aad  it*  laavea  an  the  siagletoa  sets  {1}, {2}, ...,{#}.  Let  A  he 
the  set  of  the  leavee  of  the  tree  which  are  descendants  of  the  node  • 
ia  the  rooted  tree. 

We  aay  that  a  set  3  of  nodes  is  a  proper  cover  if 

U  DawX 

A*3 


aad 

DaC\Do>w0  toe  A  fl  A’. 

Now  the  algorithm  to  determiaa  all  the  solutions  of  (110)  proceeds 
at  follows.  Let  a  set  H  :*  {d|,dt,...,da}  be  a  proper  cover.  Now  ws 
will  detanaiae  whether'S  caa  be  a  set  of  ell  ncumatly  connected 
sets,  aa  follows.  First  we  determiaa  fl(dj)  far  every  A,  €  3.  If 
any  of  the  0(ri>)'s  is  empty,  than  the  guess  S  is  nol  a  feasible  set 
of  ncumatly  cosaected  sets.  If  every  Q(dj)  it  noa -empty,  thea  let 
7,  :m  max  fli(dj},  sad  for  each  A,  determine  whether  with  the  choice 
of  7j  than  is  a  state  ij  £  A,  with  ft,  *  p.  If  so  such  stale  exists  [or 
any  A,,  thus  again  3  ia  not  a  feasible  let  of  ncumatly  cosaected 
nit.  Finally,  if  than  exist  such  d/s,  then  let  d(3)  be  the  tot  of 
all  inch  d/s.  Now,  the  tat  of  all  solutions  corresponding  to  3  ia 
obtained  by  picking,  la  tsra,  aa  A,  from  AM),  *»«g  ita  low  aa  7, 
aad  chooaiag  all  othar  7/ 1  arbitrarily  from  tka  a(d>)'t.  By  dieckiag 
team  proper  cover  2,  wo  thus  datarmias  ail  solutions  to  the  order 
baiaac*  eqaatioas.  Sea  (7)  for  the  predts  proof. 

Ws  ill ast rata  the  procedure  for  determining  all  solutions  to  the 
ordsr  balaaca  eqaaliow. 
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Example  4.1  We  construct  »U  solution*  to  the  order  balance  equa¬ 
tion*  (or  Example  3.1  when  p  •  4.  We  check  the  proper  coven: 

1.  2  »  {Jf}:  fi({l,4})  in  empty,  eo  X  ennnot  be  a  recurrently 

connected  eat. 

12*  {{1,4},  {2, 3}):  Uiing  the  method  deaeribad  above  we  ob¬ 
tain 

A  3  «.  A  *  4,  A  *  3,  A  *  1  +  °, 
where  1  <  a  <  3. 

3.  2  >  {{1,4), {2}, (3)):  max  A  <  4,  a  contradiction. 

4.  2- {{1}.  {4}.  {13}}: 

fit  ■  7.  A  *  4,  A  ■  3,  A  «  *. 
where  7  *  -on  or  0  £  7  <  1,  and  #  *  -00  or  0  <  #  <  2. 

5.  2  «  {{1}.{2},{3},{4}}:  nuft  <  4,  and  to  {{1,4}, {2,3}}  U 
not  a  aet  of  reearrently  connected  eatn. 

We  have  checked  aU  proper  coven.  Rente  the  aet  of  all  eolatiooa  U 
{(«.  4,3,1  +  a)  :  1  £  a  <  3}  U  {(7.4,1#)  :7--ooor0^7< 
1  and  #  *  -00  or  0  <  #  <  2}.  ® 

How  can  aon-aaiqae  aolaliona  to  the  order  balance  equations  ante, 
and  what  ia  the  implication  of  lack  aon-aaiqaeaeee?  Tint  let  aa  coo- 
eider  the  cane  where  a  aaiqae  eolation  axiets.  Since  rack  a  eolation  in 
iniquely  determined  by  the  algorithm,  it  ia  dear  that  the  reearrence 
orden  of  the  itatea,  and  than  the  ratea  of  convergence  of  the  Ina¬ 
nition  probability,  depend  only  on  the  Vy's  ia  the  trusitioa  prob¬ 
ability  pij(t)  m  Ci,((t)v‘‘,  and  not  on  the  proportionality  conataata 
{etj}.  However,  ia  the  can  of  nan  aniyes  eolationa,  the  following 
example  abowa  that  the  reearrence  orden  may  even  depend  on  the 
the  proportionality  constant*  {ey}. 

Example  4.3  Let  *  a  {1,2,3}  and  V,,  m  max{0 ,j  -  i}.  Let  cm  * 
(u«  I  ,cj,  »  l-o  and  cnee,  where  a  6  (0,1).  Set  ey  *  0  for  all 
other  i,j.  Let  the  cooling  ached  ale  be  r(t)  a  1/1.  Then  the  complete 
aet  of  order  balance  uquatkma  obtained  by  using  aU  edge  cata  ia, 

AS^a  ■  AQVji. 

A  8  v»  ■  A  0  Via, 

m*x(A  0  Vjj.  A  0  Pu)  ■  m*x(A0Pja.A0  Pii). 
with  the  muimam  given  by, 

myA*!. 

A  *  **  A  ■  7.  A  ■  -00 

aatiafy  the  order  balance  eqaationa  liar  every  7  €  {-oo)U(0, 1).  That 
any  valve  of  A  <  1  give*  a  eolation  of  the  order  balance  eqaationa. 

However,  a  calcalathm,  which  can  be  Coead  ia  (7),  ahowa  that  tha 
correct  order  of  recerrenca  of  atate  2  ia 

A  •«. 

Thun,  the  order  of  recvrtence.  and  tha  rate  of  convergence  of  the 
probability  Pr(*(t)  a  2)  to  0,  depend#  on  the  proportionality  con- 
atant  cj*  ■  a  involved.  ■ 

Baaed  on  the  above  results,  wo  obtain  the  (allowing  property  of 
the  orden  of  recerrenca  of  tho  rtatoo  in  a  rocaneaUy  connected  aet. 

Lnranan  4.3  Can  eider  a  rote rsemfy  manner  ad  ear  4. 

/.  If  AM  /or  warn  i  C  A,  Am  A  €  ft  fm  ill  j  €  4. 


f.  tf  for  tome  i  €  4,  fi,  *  p*  / or  jome  p,  6  R,  (Acn  /or  every 
t  €  4,  A  ■  fj  for  tome  pj  €  12. 

Proof!  Thit  followv  immediately  from  the  proof  of  Lemma  4.1.  ■ 

Than  aU  recurrence  orden  ia  a  recurrently  connected  let  are  of  the 
tame  type,  i.e.  either  they  are  ail  real  number*  p,-,  or  they  are  all  of 
the  type  p”,  or  they  are  all  -00  (tee  Definition  2.1). 

Thit  give*  us  the  following  Lemma  which  completes  the  proof  of 
Theorem  2.1. 

Lomma  4  J  Suppose  the  role  of  tooting  it  p  *  p  g  72,  with  p  >  0, 
ie.  the  maximum  it  achieved  in  Definition  t.t.  If  there  it  a  itate 
ie  X  for  which  A  ■  P~,  then  lim,.^.  Prfx(r)  *  i)  *  0. 

Proof!  Suppove  4  1*  the  reearrently  connected  clam  to  which  i  be¬ 
longs.  Sine*  all  are*  between  recurrently  connected  sets  are  transient, 
it  follow*  from  the  Borei-CaateUi  Lemma  that  along  almost  every 
sample  path  u  there  can  only  be  a  finite  number  of  transitions  be¬ 
times  different  recurrently  connected  sett.  Hence  for  almost  every 
u,  {*(<,<■>)}  converge*  to  some  recurrently  connected  set.  Hence  the 
limit  Ua<_«  Pr(x{t)  €  4)  exists.  Now  we  show  that  tbit  limit  is 
0.  Suppose  tot,  i.e.  suppose  £J</(  a ,(l)  *  i  >  0.  Then 

it  follow,  that  TZeWLjeA  »,(()  *  4-00.  Hence  for  some  j  6  4, 
fij  *  p.  But  then  by  Lemma  4.2,  A  €  72,  which  gives  a  contradiction, 
thus  proving  the  lemma.  ■ 

5  Weak  Reversibility  and  Simulated  An¬ 
nealing 

Ws  now  tarn  our  attention  to  tha  special  cl  us  of  Markov  chains  aris¬ 
ing  from  tha  mathod  of  optimisation  bp  simulated  annealing.  Recall 
that  the  Markov  chain*  in  this  daaa  aatiafy  (1-4)  with  the  special 
choict  of 

Kj  :■  mi*{0,  Wy  -  Wi). 

In  [1]  it  wns  shown  that  under  the  ‘symmetric  neighborhood”  as¬ 
sumption,  Cj,-  >  0  if  and  only  if  e*  >  0,  the  orders  of  recurrence 
satisfy  the  following  detailed  order  balance, 

fin*  fin  for  every  i,j  6  X. 

It  ii  easy  to  see  that  the  detailed  order  balance  above  is  equivalent 
to  the  ram  of  the  order  of  recurrence  of  a  state  and  its  coat  being 
constant  on  recurrently  connected  sets. 

Ia  this  section  we  will  show  that  this  constancy  property  of  the 
sura  of  the  recurrence  order  and  cost  on  recurently  connected  sets 
continues  to  hold  under  the  much  weaker  assumption  of  “weak  re¬ 
versibility”  introdeced  by  Hajek  in  [2]. 

Definition  1. 1  4  state  i  ia  mid  to  he  reachable  from  stale  j  if  there 
it  a  tegvence  of  itatei  j  m  i0,iI,...,ip  *  i  tech  that  ej4(j,4(  >  0  for 
0Sk<p-l. 


Definition  (.3  4  itate  i  it  reachable  el  height  H  from  j  if  then  it  a 
path  from  j  to  i  at  in  Definition  5.1  for  which  W,,  <  H  for  0  <  k  <  p. 


Assumption  1.1  (Weak  Rnvwrsibiiity)  For  any  real  number  H 
and  any  two  stale*  i  and  j,  i  it  reachable  et  height  11  from  j  if  and 
o  nip  if  j  it  reachable  at  height  H  from  i, 

to  what  follows  sm  seism*  weak  reversibility. 

Theorem  1.1  (Tho  Potential  Theorem)  Under 
Anmmption  5,1,  for  carry  rmrrmttp  connected  act  4  there  etitU 
e  constant  o(4)  such  that  A4-W*  o{4)  for  ever p  i  6  4. 
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Proof.  See  [7J.  ■ 

Since  Wi  +  A  *  o<  A)  fat  all  i  C  A,  whan  A  is  a  recurrently  con¬ 
nected  tat,  aw  obtain  tba  following  neceteary  and  sufficient  condition 
Cor  simulated  annealing  to  bit  a  /total  minimum  with  probability  oaa 
from  all  itaiaa  i  €  X. 

Lat  M  :*  {»  €  X  :  Vfi  S  W*  (dr  all  j  €  X}  bn  tba  aat  of  global 
minima  Wa  now  have  tba  fallowing  definition  due  to  (2]. 

Definition  IJ  Let  d*  be  the  imaliaat  number  milk  tin  property  that 
/or  every  i  £  X  thorn  exult  a  path  (i  *  i,)  mth  «■,>»*,  >  0 

/or  0  <  *  <  p  and  ending  in  a  minimuer  if  €  14  tuck  that 

Wif-WiS*  Jorkml . . 

Wo  shall  call  f  the  depth  of  the  minimisation  proilea. 

Tbooram  IJ  Suppata  that  noth  ntertshslily  holds 
I.  ffVT,  <flW*  *  +00,  Ue»  /or  every  Mia/  condition  *(0)  €  Jf, 

Umtap-jrVPr(*(0  6  Ad) »  1, 

and  the  /total  mrmntam  it  kit  with  probability  ana. 

t.  IJ  ££,  ((tK*  <  +00,  then  then  txuU  an  initial  condition 
x(0)  €  X  for  whack, 

Pr(z(t)  €  hi*  for  a/i  <  >  l)  >  0. 


Proof  The  proof  la  tba  mm  aa  in  Theorem  (44)  in  [1]  except  that 
t/ (i«ie,. /)  la  a  path  from  t  toy  with  <t,  (1  :  0  and  VP;,  - 

W,  £  y  for  1  £  h  £  p,  than  inataad  of  wing  the  reverted  path  (j  m 

if . it  *  i)  0«an  by  tba  aaaamption  of  eyour  stria  neighborhoods, 

oaa  aaaa  tba  path  (j  a  I#,...,  I,  a  i)  with  >  0  and  IV|,  -  VPt  S 

7  far  1  S  *  <  ♦,  guaranteed  by  tba  wen1:  reversibility  aaaumplioa.  ■ 

Tba  fame  condition  ££,  «(</*"  *  oo  baa  bean  aariiar  aboam  by 
llajab  (2)  to  be  aacaaairy  and  sufficient  far  Urn,  Pr(x(r)  €  if)  a 
1,  in.  far  convergence  in  probe kihrp  Tba*  while  raaalt  1)  above 
is  weaker  than  bit  tinea  it  iavolvaa  Caaaio  aa  oppoaad  to  regular 
can varganca,  tba  raaalt  2)  la  a  ttroagar  temple  path  raaalt. 

it  ia  alao  worth  noting  that  if  one  additionally  assume*  tba  property 
of  symmetric  neighborhoods,  Cjj  >  0  mm  Cji  >  0,  than  tba  detailed 
balance  raaalt  of  (l)fallowa  ana  corollary  of  Theorem  3.1,  a*  we  show 
below. 

Corollary  1.1  (Dwtailad  Balance)  Under  the  symmetric  neigh- 
| issvmptiom, 

fits  •  f>H  hr  eeetf  ij  £  X. 

Proof:  Sea  [7].  ■ 

Note  that  by  tba  above  raaalta ,  if  tba  order  of  recurrence  of  even 
aw  data  ia  a  connected  component  ia  known,  than  tba  order*  of 
murraaca  far  all  tba  ataiaa  belonging  to  tba  connected  component 
err  determined.  However,  ae  Example  4.2  shows,  it  ia  not  alwaya 
p.wMble  to  determine  the  order  of  recurrence  of  even  ooe  atata  ia  a 
i  faceted  component.  In  that  example,  the  connected  component*  of 
iw  arrant  tlalae  are  the  ante  (1)  and  (3).  We  do  not  know  the  order 
<4  recurrence  of  tba  smgfa  male  ia  the  connected  component  (2), 
oitbwei  taking  into  account  the  proportionality  conet  ante  involved 
.a  Ike  inanition  peobahillllae.  Than,  far  thia  example  tba  detailed 
beleace  eqealioos  are  not  eafldent  far  determining  A-  However, 
a<4e  that  the  0-fiow»  do  tatiafy  Corollary  5.L 


6  Conclusions 

The  notion  of  order  of  recurrence  providaa  a  novel  approach  for  an- 
alytiag  tba  darn  of  Markov  ckaiaa  wboee  traaaitioa  probabilitiea  era 
proportional  to  powen  of  n  tiote- varying  parameter  c(t).  Theae  re¬ 
currence  order*  aatiefy  a  art  of  balance  equation*,  and  the  Markov 
chain  converge*  ia  a  Ceearo  tease  to  the  net  of  state*  with  tba  Urgant 
recurrence  order*.  W*  have  given  an  algorithm  far  generating  a  toiu- 
tioo  to  the  order  balance  equation*  and  have  alao  provided  a  method 
far  characterizing  all  eoiutioaa  to  theae  aquation*.  In  tome  equa¬ 
tion*  where  non- unique  solution*  axial,  the  order*  of  recurrence  can 
depend  on  the  proportionality  constant*  involved  in  tha  (rantitioa 
probabilitiea,  and  not  just  on  their  order*  of  magnitude.  Thia  prob¬ 
lem  remain*  an  open  iatue. 

The  method  of  optimization  by  simulated  annealing  fall*  within  the 
framework  of  tki*  clam  of  Markov  ckaiaa.  W#  have  shown  that  if  the 
Markov  process  ia  weakly  reversible,  than  tba  sum  of  the  recurrence 
order  and  tba  coat  ia  a  constant  on  each  aat  of  slate*  connected 
by  recurrent  arc*.  Thin  allow*  an  to  determine  the  necessary  and 
sufficient  on  the  cooling  rale  far  the  optimization  algorithm  to  kit  a 
global  minimem  with  probability  one  from  all  initial  slate*. 
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ABSTRACT 

Several  important  problems  in  diverse  application  areas  such  as  image  restoration,  code  design,  and  VLSI  design,  contain  at 
their  core  an  optimization  problem  whose  solution  crucially  determines  the  performance  of  live  resulting  engineering  system. 
Standard  descent  algorithms  for  such  optimization  problems,  however,  typically  get  trapped  in  local  minima,  and  fail  to  reach 
solutions  at  or  near  the  global  minimum.  Motivated  by  the  problems  of  determining  the  global  minima  of  optimization  prob¬ 
lems,  the  algorithm  of  simulated  annealing  for  optimization  has  been  proposed.  Here  we  present  recent  results  on  the  perfor¬ 
mance  of  this  algorithm  in  reaching  the  global  minimum  of  combinatorial  optimization  problems. 

1.  INTRODUCTION 

I  A  -J 

Several  important  application  areas  as  diverse  as  image  restoration,  code  design,  and  VLSI  design,  require  at  their  core 
the  solution  of  an  optimization  problem,  typically  combinatorial  optimization  problems.  It  is  for  this  reason  that  the  subject 
of  combinatorial  optimization  has  attracted  much  attention,  e.g.,  in  recent  years. 

However,  such  combinatorial  optimization  problems  possess  a  large  number  of  local  minima,  and  standard  descent  schemes 
for  solving  them  typically  get  stuck  in  such  local  minima,  and  fail  to  reach  solutions  at  or  near  the  global  minimum.  A  good 
illustration  of  this  fact  can  be  found  in3  for  the  well-known  traveling  salesman  problem,  and  this  is  one  of  the  prime  reasons 
why  such  problems  are  intrinsically  hard.  Indeed  for  the  traveling  salesman  problem,  which  is  one  of  the  most  well  known 
examples  of  “intractable”  problems,  no  non-trivial  choice  of  neighborhood  structure  can  eliminate  the  possibility  of 
existence  of  local  minima. 

Motivated  by  the  critical  need  to  solve  such  problems,  the  algorithm  for  optimization  by  simulated  annealing  was  proposed 
in  .  It  is  inspired  by  the  problem  of  growing  crystals  in  statistical  mechanics,  where  “annealing”  is  the  process  by  which  a 
solid  is  initially  heated  to  a  high  temperature,  and  then  cooled  so  slowly  that  it  settles  into  a  crystalline  state  corresponding  to 
a  global  minimum  of  the  energy  state.  The  cooling  needs  to  be  slow,  since  too  rapid  a  cooling  schedule  traps  the  solid  in 
higher  energy  local  minima  which  can  correspond  to  defects  in  the  crystalline  structure. 

By  an  analogy  with  the  physical  process  of  cooling  which  can  attain  states  near  the  global  minima,  see  also,  the  simulated 
annealing  procedure  for  optimization  is  a  Monte-Carlo  algorithm  which  is  a  slight,  but  important,  modification  of  descent 
algorithms.  It  occasionally,  and  randomly,  accepts  uphill  moves,  in  addition  to  always  accepting  downhill  moves.  The 
parameter  governing  the  acceptance  of  uphill  moves,  is  analogous  to  the  “temperature”  and  it  is  gradually  reduced  to  zero. 
The  object  of  such  a  scheme  is  that  at  high  “temperature”  the  algorithm  will  escape  local  minima,  and  then  slowly  evolve 
into  a  pure  descent  scheme  which  seeks  out  a  global  minimum. 

Being  inspired  by  statistical  physics,  motivated  by  the  solution  of  engineering  problems,  and  posing  several  mathematical 
questions,  this  algorithm  has  attracted  the  attention  of  physicists,  engineers  and  mathematicians  alike. 

In  the  rest  of  this  paper  we  will  present  some  key  results  that  have  been  obtained  on  the  performance  of  this  algorithm,  as 
well  as,  an  open  issue  on  which  more  research  is  needed. 
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2.  THE  SIMULATED  ANNEALING  ALGORITHM 

Let  X  be  a  finite  set,  and  let  W:  X-+R  be  a  given  cost  function  on  X,  The  goal  is  to  minimize  W(x)  over  x  e  X. 

Corresponding  to  each  state  i  e  X,  let  NicX  be  a  deleted  neighbothood  of  X,  with  i  4  X.  Let  (qy:  i  e  X,  j  €  H)  be  such  that 
qjj20  and  Jj.N.qy  =  1.  Finally,  let  0<e(t)<l  be  a  cooling  schedule  with  lim,^ e (t)  =  0.  For  simplicity,  one  may  also 
assume  that  (e(t))  is  monotone  decreasing. 

Consider  the  Maikov  chain  (x(t)}  on  X  with  transition  probabilities  defined  by 

Pti(t)  =  qiie(t)IW/"WJ*  forjeNi 
=  1  - forj®i 

The  simulated  annealing  procedure  consists  of  moving  through  the  state  space  X  according  to  this  Markov  chain. 

Essentially,  the  scheme  consists  of  two  steps  at  each  iteration.  Suppose  that  at  an  iteration  t,  x(t)  =  i.  Then  one  chooses  a 
neighbor  j  randomly  from  Ni  according  to  the  probabilities  qy.  If  WjSWi,  then  j  is  accepted  and  x(t+l)  is  set  to  j.  Thus 
downhill  moves  are  readily  accepted.  On  the  other  hand,  if  Wj  >  Wj,  then  j  is  accepted  with  probability  e  (t)W|'w‘  and  rejected 
with  probability  1— €  (t)w~  *.  If  j  is  accepted  then  x(t+l)  is  set  to  j;  otherwise  if  j  is  rejected  then  x(t+l)  remains  at  i. 

Thus  the  scheme  is  seen  to  be  a  simple  modification  of  standard  descent  algorithms.  The  parameter  e  (t)  is  the  analog  of  tem¬ 
perature. 

In  an  application  such  as  the  traveling  salesman  problem,  X  will  denote  the  set  of  all  tours.  A  neighborhood  structure  can  be 
imposed  by  deleting  two  arcs  in  the  tour  and  replacing  them  with  two  other  arcs;  see5  for  examples. 

3.  SIMULATED  ANNEALING  TYPE  MARKOV  CHAINS 

More  generally,  simulated  annealing  gives  rise  to  a  time  inhomogeneous  Markov  chain  over  a  finite  state  space  X  with  transi¬ 
tion  probabilities  given  by: 

Pi)0)=tf,E(t)v'  if  je  N. 


where  Vy  £  0.  If  itj(t)  denotes  the  probability  of  occupying  state  i  at  time  t,  then  the  goal  is  to  determine  the  asymptotic 
behavior  of  {x(t)J  as  well  as  (iti(t)). 

4.  RECURRENCE  ORDERS  AND  BALANCE  EQUATIONS 

8 

In  we  have  shown  that  one  may  analyze  the  asymptotic  behavior  of  such  Markov  chains  by  examining  quantities  which  wc 
call  "recurrence  orders:”  Let  us  define  (ft :  ie  X}  by 

ft:*-»  if|j*i(0<+"» 


>sup(clc^Oand^c(t)eKi(t)<+«»)  otherwise. 


If  the  supremum  above  is  not  attained,  we  will  denote  ft  by  er  rather  than  c.  Let  us  also  define 
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I 

:=  ^  -  Vu  otherwise. 

O 

We  shall  call  ft  as  the  recurrence  order  of  stale  i,  and  Pg  as  the  recurrence  order  of  the  transition  from  i  to  j  (see  for  more 
precise  details). 

.  8 

The  following  fundamental  result  was  obtained  in  . 
j  Theorem:  Balance  of  Recurrence  Orders 

For  every  A  cX, 

It  is  worth  noting  that  this  balance  equation  differs  fundamentally  from  traditional  balance  equations  which  represent  balance 
of  flow  between  two  spatially  separate  sets  in  equilibrium.  In  contrast,  our  balance  equation  is  for  a  process  which  is  not  in 
equilibrium;  moreover  it  is  a  balance  in  “lime."  - 

The  advantage  of  this  balance  equation  is  that  it  converts  the  difficult  analytical  problem  of  determining  the  asymptotic 
behavior  of  a  time  -  inhomogeneous  stochastic  process  into  a  purely  algebraic  problem  of  solving  the  balance  equations, 
o 

In  we  have  obtained  circulation  based  graph  theoretic  algorithms  to  solve  these  balance  equations. 

It  has  also  been  shown  that  the  Markov  process  converges  in  a  Cesaro- sense  to  the  set  of  states  with  the  largest  recurrence 
orders;  see  .  Thus,  the  solution  of  the  algebraic  problem  gives  the  asymptotic  behavior. 

It  should  be  noted  that  Tsitsiklis1®  has  also  investigated  such  general  Markov  chains.  His  approach  which  essentially  obtains 
bounds  on  the  state  occupation  probabilities  for  time-invariant  Maikov  chains,  and  then  employs  them  for  time- 
inhomogeneous  chains  sampled  over  long  time  intervals,  is  quite  different  from  ours. 

^■-j^PLICATIQN-TpSI.MU^ATEDj^NNEAleO^Q 

Simulated  annealing  corresponds  to  the  special  case  where  Vy  =  [Wj-WJ+.  For  simplicity  let  us  suppose  that  the  neighbor¬ 
hood  structure  is  symmetric,  i.e.,  i  e  Nj  if  and  only  if  j  e  N;. 

a 

Then  we  have  obtained  a  considerably  stronger  statement  of  ’detailed  balance’,  see  . 

Theorem:  Detailed  Balance  for  Simulated  Annealing 

=  for  all  j,i. 

8 

Using  this  result  we  have  obtained  in  the  necessary  and  sufficient  condition  on  the  cooling  rate  of  (e(t))  for  simulated 
annealing  to  hit  the  global  minimum  with  probability  one  starting  from  all  states.  It  is  necessary  to  introduce  the  notion  of 
“depth"  of  an  optimization  problem. 

Definition:  Depth  of  an  optimisation  problem 

Let  d  be  the  smallest  number  such  that  for  every  i  €  X,  there  exists  a  path  i  =  io.it . i«<,>  with  i^  a  global  minimizer  of  W, 

satisfying  i^i  e  for  k  * 0, 1, . . . , n(i}-l,  and  such  that  W(ik}-W(i)  S  d  for  k  =  0, 1 . n(i)— 1. 


■4 


Essentially,  the  depth  measures  the  deepness  of  local  minima. 

g 

In  we  have  proved  the  following  Theorem. 
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Theorem:  Necessary  and  Sufficient  Conditions  to  Hit  a  Global  Minimum 

pj  x(t))  hits  a  global  minimum  for  some  t^oj  si 
if  and  only  if  ^  e  (t)4  =  +«. 


Earlier,  it  has  been  shown  in  *  that  a  similar  condition  is  necessary  for  the  simulated  annealing  Markov  chain  to  converge  in 
probability  to  the  set  of  global  minmizers.  Cur  proof  of  necessity  of  this  condition  to  guarantee  ever  hitting  the  global 
minimum  is  a  stronger  sample  path  statement 

o 

In  we  have^also  generalized  this  result  to  Markov  chains  which  do  not  satisfy  a  symmetry  condition,  but  satisfy  instead  what 
is  called  in1 1  a  “weak  reversibility  condition.’* 


6.  CONCLUDING  REMARKS 

At  this  stage  we  have  a  good  understanding  of  the  time  vs.  temperature  asymptotics  of  simulated  annealing.  It  is  of  consider¬ 
able  interest  to  study  the  asymptotic  behavior  of  the  simulated  annealing  algorithm  as  the  size  of  problem  instances  grows, 
much  as  in  the  theory  of  computational  complexity.  The  results  obtained  can  be  used  to  measure  the  complexity  of  the  algo¬ 
rithm  in  probabilistic  terms. 
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treated  as  a  function  of  sknewness  alone,  is  asymptotically  tighter  than  the  Jerrum  and 
Sinclair  bound.  We  also  show  that  our  bound  is,  in  general,  much  easier  to  compute  for 
SA  chains. 
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ABSTRACT 

This  paper  presents  a  novel  upper  bound  for  the  second  largest  eigenvalue  of  a  finite  reversible 
time-homogeneous  Markov  chain  as  a  function  of  three  parameters,  namely,  the  smallest  transition  pro¬ 
bability,  the  underlying  structure  of  the  chain,  and  the  skewness  of  the  equilibrium  distribution.  This 
eigenvalue  bound  enables  us  to  bound  the  time  constant  of  convergence  of  a  reversible  Markov  chain 
to  its  equilibrium  distribution.  Simulated  Annealing  (SA),  is  an  example  of  a  probabilistic  algorithm 
that  is  widely  used  for  solving  combinatorial  optimization  problems,  wherein  the  transition  probabilities 
are  controlled  by  a  certain  temperature  parameter  T> 0.  The  behavior  of  SA  at  a  fixed  temperature 
T>0  can  be  modeled  by  a  reversible  time-homogeneous  Markov  chain  converging  to  an  equilibrium 
distribution  at  that  temperature.  As  the  temperature  T -*0,  the  equlibrium  distributions  themselves  con¬ 
verge  to  the  optimal  distribution.  Using  the  results  of  this  paper,  we  can  not  only  bound  the  time  con¬ 
stant  of  convergence  of  SA  to  equilibrium  at  any  arbitrarily  small  but  fixed  temperature  T >0,  but  also 
study  the  growth  of  this  bound  as  T  — K);  thereby,  providing  a  fairly  good  understanding  of  the  tem¬ 
perature  asymptotics  of  the  simulated  annealing  algorithm.  The  eigenvalue  bound  of  this  paper  is 
compared  with  the  bound  derived  by  Jemim  and  Sinclair  in  [4].  We  exhibit  a  class  of  Markov  chains 
on  which  our  bound,  treated  as  a  function  of  skewness  alone,  is  asymptotically  tighter  than  the  Jemim 
and  Sinclair  bound.  We  also  show  that  our  bound  is,  in  general,  much  easier  to  compute  for  SA 
chains. 


t  This  work  «u  supported  in  pen  by  i  grant  from  the  Semiconductor  Research  Corporation  under  contract  #  SRC  86-12-109,  and  in 
part  by  a  gram  from  the  United  Slates  Air  Force  Office  of  Scientific  Research  under  contract  *  AFOSR  88-0181. 
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1.  INTRODUCTION 

Let  ft  =  {1,2 . N)  be  a  discrete  state  space,  and  consider  a  time-homogeneous  Markov  chain 

( X(k)  )  on  ft  with  an  NxN  probability  transition  matrix  P  =  [p<;]  such  that  for  any  i,  ;eft,  and  time 

*20, 

Pij  =  Prob  ( X(k+\)=j  |  X(k)=i  )  (1.1) 

Let  v(*)  =  [v,(*)]  be  die  IxN  distribution  vector  describing  the  chain  at  time  *  such  that 

v,(*)  =P'ob(X(ky=iy,  it  follows  that  v(*+l)  =  v(*)  P . 

Suppose  the  Maikov  chain  converges  to  an  equilibrium  distribution  vector  n,  i.e., 

lim  v(*)  =  n  =  jc  P.  (j.2) 

In  this  paper,  we  are  primarily  interested  in  the  speed  of  convergence  of  v(*)  to  7t.  Let 
1=Xi2|Xj|2  •  •  •  2|XW |  denote  the  eigenvalues  of  P  arranged  in  descending  order  of  magnitude.  It  is 
then  well  known  [1]  that  the  error  at  time  *  can  be  bounded  by 

||v(*)-jc||  £  An  1^1*  (1-3) 

where  1 1  •  1 1  is  any  lp  norm  and  AN  is  a  constant  independent  of  time  * .  For  our  purposes  it  suffices 

to  consider  only  the  norm.  One  can  rewrite  (1.3)  in  the  form  |  |v(*)  -  tc|  |  <  AN  e~klx  where 

x  =  -(log  |  Mr1  (1.4) 

is  said  to  be  the  time  constant  of  convergence  of  the  Markov  chain  to  its  equilibrium  distribution.  It 

follows  that  if  IXal^l-l/^  for  some  q~>  1,  then  x^q .  Furthermore,  given  any  0<5<1  we  will  have 

|  |v(*)  -  it  1 1  S  6  whenever  *  >  [  log (AN)  +  log(l/5)  ]  x.  Therefore,  the  rate  at  which  the  Markov 

chain  achieves  equilibrium  is  determined  by  the  time  constant  x  and  hence  by  the  eigenvalue  of  second 

largest  magnitude  X?. 
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The  main  result  of  this  paper  is  the  derivation  of  an  upper  bound  on  the  eigenvalue  of  second 
largest  magnitude  of  a  reversible  Markov  chain.  In  his  remarkable  paper  [3],  Alon  established  the 
relationship  between  the  second  smallest  eigenvalue  ^(Q )  of  the  Laplacian  matrix  Q  of  a  graph  G , 
and  a  certain  expansion  parameter  c(G)  of  the  graph.  A  direct  application  of  his  ideas  to  Markov 
chains  leads  to  a  useful  bound  only  for  the  case  of  symmetric  Markov  chains  as  shown  in  [2].  A  sym¬ 
metric  Markov  chain,  however,  can  only  have  the  uniform  equilibrium  distribution,  namely,  7t,=l IN  for 
all  iefl.  In  this  paper  we  seek  a  useful  bound  for  reversible  Markov  chains  which,  in  general,  could 
have  non-uniform  equilibrium  distributions. 

The  bound  derived  in  this  paper  is  of  the  form  \Ki\<,\-Mq,  where  q  is  related  to  the  minimum 
non-zero  off-diagonal  entry  in  P ,  the  skewness  of  its  equilibrium  distribution  vector  (a  measure  of  the 
non-uniformity  of  the  distribution  defined  by  (2.1)  in  Section  2).  and  p2(G)-  Recently,  Jerrum  and 
Sinclair  [4]  have  derived  an  alternate  bound  of  the  form  |S1— <f>2/2,  where  0  is  a  certain  conductance 
parameter  associated  with  the  reversible  Markov  chain  which  is  an  extension  of  the  expansion  idea  for 
edge-weighted  graphs.  We  compare  the  two  bounds  and  exhibit  a  class  of  Markov  chains  for  which 
our  bound,  treated  as  a  function  of  skewness  alone,  is  asymptotically  tighter  than  the  Jerrum  and  Sin¬ 
clair  bound. 

Reversible  Markov  chains  are  of  interest  because  they  can  be  used  to  model  stochastic  algorithms 
for  combinatorial  optimization  such  as  Simulated  Annealing  (SA)  [6].  As  an  application  of  our  results, 
we  will  consider  using  SA  at  a  fixed  temperature  to  solve  some  specific  combinatorial  optimization 
problems  and  derive  bounds  on  the  time  constant  of  convergence  of  such  chains. 

The  rest  of  this  paper  is  organized  as  follows.  In  Section  2,  we  establish  some  notation  and 
definitions,  and  present  some  basic  results  from  Linear  Algebra  and  Non-negative  matrices  that  are 
required  for  the  rest  of  the  paper.  A  new  upper  bound  for  the  second  largest  eigenvalue  of  a  reversible 
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transition  matrix  is  presented  in  Section  3.  The  SA  algorithm  is  briefly  described  in  Section  4  fol¬ 
lowed  by  a  discussion  of  the  temperature  asymptotics  of  the  corresponding  reversible  Markov  chains. 
A  comparison  between  our  new  bound  and  that  derived  by  Jerrum  and  Sinclair  is  also  made  along  with 
an  analysis  of  the  time  constant  of  convergence  of  such  chains.  Finally,  in  Section  S,  we  summarize 


our  conclusions. 
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2.  PRELIMINARIES  AND  DEFINITIONS 

We  study  a  time-homogeneous  Markov  chain  (X(k))  on  a  finite  state  space  £I={1,2,  .  .  .  ,N) 
with  transition  matrix  P  =  (p^J.  We  begin  by  reviewing  some  basic  material  on  nonnegative  matrices 
in  general.  In  this  paper  we  are  using  the  standard  graph-theoretic  terminology  from  [5]. 

Definition  2.1  :  The  underlying  directed  graph  of  P  is  a  directed  graph  Gd(V ,  Ed)  with  venex  set 
V=£2,  and  an  arc  (i ,j )  directed  from  vertex  i  to  vertex  j  if  and  only  if  pij* 0.  The  matrix  P  is  irredu¬ 
cible  if  there  exists  a  directed  path  from  each  vertex  to  every  other  vertex  in  its  underlying  directed 
graph  Gd.  For  an  irreducible  matrix,  let  r  denote  the  greatest  common  divisor  of  the  lengths  of  all  the 
directed  cycles  in  its  underlying  directed  graph.  If  r=l  the  matrix  is  said  to  be  primitive. 

A  primitive  matrix  P  also  has  the  property  that  there  exists  an  integer  m >0  such  that  Pm  has  all 
strictly  positive  entries.  The  Markov  chain  itself  is  said  to  be  irreducible  (primitive)  if  its  transition 
matrix  P  is  irreducible  (primitive).  Some  authors  refer  to  irreducible  Markov  chains  as  ergodic  chains, 
and  to  primitive  chains  as  regular  chains.  We  summarize  some  basic  facts  in  the  following  theorem 
from  the  Perron-Frobenius  theory  of  nonnegative  matrices. 

Theorem  22  :  [1]  Consider  an  irreducible  Markov  chain  with  transition  matrix  P  and  distribution 
vector  v(Jk).  Then 

(1)  A^l  is  the  largest  eigenvalue  of  P .  Moreover,  1  is  a  simple  eigenvalue. 

(2)  Let  it  be  the  left  eigenvector  corresponding  to  the  eigenvalue  1  of  P,  i.e.,  k  =  nP ,  satisfying 

N 

£jc,=1.  Then  >  0  for  all  iefi.  Furthermore,  any  right  eigenvector  x  corresponding  to  any 

««i 

N 

other  eigenvalue  A<1  of  P  must  be  orthogonal  to  jc,  i.e.,  Xrc, =  0. 

i-l 
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(3)  Let  y  be  the  right  eigenvector  corresponding  to  the  eigenvalue  1  of  P ,  i.e.,  y  =  Py,  satisfying 

N 

£Yi=L  Then  y,  =  1 W  for  all  reft.  Furthennore,  a  left  eigenvector  corresponding  to  any  other 

i«l 

eigenvalue  of  P  must  be  orthogonal  to  y. 

(4)  Given  any  starting  distribution  vector  v(0),  the  distribution  vector  v(k)  at  time  k  converges  in 

l*-t 

Cesoro  sense  to  «  defined  in  (2)  as  £-*»,  i.e.,  lim—  V  v (/ )  =  n.  If,  however,  the  Markov 

*— **/to 

chain  is  primitive,  then  v(k)  actually  converges  to  Jt  in  the  regular  sense,  i.e.,  limv(jfc)  =  xc. 

k—**> 

The  left  eigenvector  jc  defined  in  (2)  above  is  called  the  equilibrium  distribution  vector  of  the 
irreducible  Markov  chain.  It  must  be  noted  that  from  (4)  above  convergence  to  the  equilibrium  vector 
is  guaranteed  only  for  primitive  chains.  For  general  irreducible  chains,  convergence  occurs  only  in  a 
weak  Cesaro  sense. 


Definition  23  :  Consider  a  Markov  chain  with  a  structuralfy-symmetric  transition  matrix  P ,  i.e.,  p^> 0 
if  and  only  if  pjL> 0.  Its  underlying  undirected  graph  is  a  simple  undirected  graph  G(V,  E)  obtained 
from  the  underlying  directed  graph  Gd(V ,  Ed)  by  deleting  all  self-loops  and  replacing  directed  2-cycles 
by  simple  edges.  Thus,  arcs  (i,j)  and  (J  ,i )  in  Gd  are  replaced  by  a  single  edge  [i,j }  in  G. 


Definition  2.4  :  For  an  irreducible  Markov  chain  with  equilibrium  distribution  7t,  we  define  the  skew¬ 
ness  sn  of  the  chain  to  be 


*; 

s%  =  max  — 
i j«n  Kj 


(2.1) 


Qearly,  s,  for  an  irreducible  Markov  chain  is  well  defined  since  such  a  chain  has  Jt,>0  for  each  ieQ 
firom  part  (2)  of  Theorem  2.2.  The  main  result  of  this  paper  deals  with  reversible  Markov  chains. 


6 


which  we  now  define  as  follows. 

Definition  23  :  An  irreducible  Markov  chain  with  transition  matrix  P  and  equilibrium  distribution 
vector  it  is  said  to  be  reversible  if,  for  all  i,jeCl,  we  have 

Pij  *«  =  Pji  Kj  (2.2) 

A  reversible  Markov  chain  has  the  following  interesting  property.  The  proof  is  an  easy  conse¬ 
quence  of  the  discussion  above,  and  is  therefore  omitted. 

Proposition  2.6  :  Consider  a  reversible  Markov  chain  with  transition  matrix  P  and  equilibrium  distri¬ 
bution  vector  jc.  Define  d,  =  VicT  for  each  ieQ,  and  the  diagonal  matrix  D  =  diag[  dud2<  •  •  •  ,dN]. 
Then, 

(i)  D2  P  is  a  symmetric  matrix. 

(ii)  D  P~D~l  is  a  symmetric  matrix. 

(iii)  Consequently,  P  is  diagonalizable  and  has  real  eigenvalues. 

In  general,  for  any  KxK  matrix  M  with  real  eigenvalues,  let  \l(M)2X2(.M)>  ■  ■  ■  >XK(M)  denote  the 
eigenvalues  of  M  arranged  in  descending  order.  Thus,  )  denotes  the  largest  eigenvalue,  X2(M) 
denotes  the  second  largest  eigenvalue  etc.  Using  this  notation,  Theorem  2.2,  and  the  above  proposi¬ 
tion,  it  is  clear  thct  for  a  transition  matrix  P  of  a  reversible  Markov  chain  we  have 

U\l(P)>X2(P)Z\3(.P)2  }>\N(P).  (2.3) 

There  are  several  symmetric  matrices  associated  with  undirected  graphs.  For  this  paper  it  suffices  to 
consider  only  one  of  them. 


Definition  2.7  :  Given  a  simple  undirected  graph  G(V  £)  on  N  vertices  (i.e.,  no  self  loops  and  no 
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multiple  edges).  Let  deg  O' )  denote  the  degree  of  vertex  ieV  which  is  the  total  number  of  edges  in  E 
incident  on  the  vertex  i.  Then  the  Laplacian  matrix  Q  (G )  is  an  NxN  matrix  with  entries  defined  as 


deg(i)  if  j=i 
•  -1  if  [i,j  }e£. 
0  otherwise 


(2.4) 


Clearly,  the  Laplacian  matrix  Q(G)  is  a  symmetric  matrix.  The  following  theorem  (stated  without 
proof)  provides  some  more  information  about  Q  ( G ). 


Theorem  2.8  :  If  G  (V,£)  is  a  connected  simple  graph  with  a  Laplacian  matrix  Q ,  then 

(1)  QL  -  Q.  where  1  is  a  vector  with  each  entry  =  1.  Hence,  Q  has  an  eigenvalue  0  with  eigenvec¬ 
tor  1.  Moreover,  0  is  a  simple  eigenvalue  of  Q ,  i.e.,  rank {Q )  =  N-l. 

(2)  The  quadratic  form  xT Qx  =  £  Oc,  -  xf) 

(ij)€E 

(3)  There  exists  a  N-\xN  matrix  B  of  full  rank  such  that  Q  ~  BT B . 

In  general,  for  any  KxK  matrix  M  with  real  eigenvalues  let  p.1(M)<p2(M)<  •  •  •  <(%  (M )  denote 
the  eigenvalues  of  M  arranged  in  ascending  order.  Thus,  Pj(M)  denotes  the  smallest  eigenvalue, 
p2(M)  denotes  the  second  smallest  eigenvalue  etc.  From  Theorem  2.8,  we  have  Q  is  positive  semi- 
definite,  and  has  eigenvalues 

0=Hi(Q  X^Q  )S  •  •  •  S \iN  {Q ).  (2.5) 

The  following  results  will  prove  useful  in  deriving  our  eigenvalue  bound  in  the  next  sectioa 

Lemma  2.9  (Min-max  principle  [13])  :  If  A  and  B  are  any  two  symmetric  K  xK  matrices  such  that 
A-B  is  positive  semi-definite,  then  for  each  i  =  1,2, . . . ,  K,  we  have  p,  (B)  £  p,  (A ). 


8 

Lemma  2.10  :  Let  B  be  any  N-lxN  matrix  of  full  rank.  Then,  for  each  i  =  1,2,  .  .  .  ,N- 1,  we  have 
=  V.l+l(BTB). 

Consequently,  the  smallest  eigenvalue  of  BBr  is  the  second  smallest  eigenvalue  of  BTB ,  the  second 
smallest  eigenvalue  of  BBT  is  the  third  smallest  eigenvalue  of  BTB ,  and  so  on.  We  use  these  to  prove 
the  next  result. 

Theorem  2.11  :  Let  Q  be  any  NxN  symmetric  and  positive  semi-definite  matrix  with 
rank(Q)  =  N- 1,  E  be  a  NxN  diagonal  matrix  with  strictly  positive  diagonal  entries,  and  omm>0 
denote  the  smallest  diagonal  entry  in  E.  Then, 

(1)  The  NxN  matrix  EQE  is  symmetric  and  positive  semi-definite. 

(2)  Also.  p2(E£  E)  >  a^2(Q  )• 

Proof  :  The  proof  of  (1)  is  obvious.  To  prove  (2),  use  Equation  (2.4)  to  write  Q  =  BTB ,  where 
B  is  an  N-lxN  matrix  of  full  rank.  Define  C  =  B  E.  Clearly  C  is  also  of  full  rank  since  E  is  a 
diagonal  matrix  with  strictly  positive  diagonal  entries.  Also,  EQE  =  I£T B  E  =  CT C .  Therefore, 
by  Lemma  2.10, 

p2(EGE)  =  n2(CrC)  =  Hi(CCT)  (2.6) 

But  CCT  =  BZ2BT .  Also,  for  any  vector xe Rw_1,  the  quadratic  form 

xT{CCT  -  oZjBBT)x  =  2  (o^-aijy,2  >  0.  (2.7) 

<-i 

where  we  have  defined  y  =  BTx.  Therefore  the  matrix  CCT  -  a2^BT  is  positive  semi-definite 
by  definition;  hence,  by  Lemma  2.9,  we  conclude  that 

\Li(CCTy2oLvx(BBT) 

Applying  Lemma  2.10  once  again,  we  get 


(2.8) 
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Hi(BBT)  =  \i2(BtB)  =  \l2(Q) 
Combining  (2.6),  (2.8),  and  (2.9)  proves  this  theorem. 
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3.  A  NEW  EIGENVALUE  BOUND  FOR  REVERSIBLE  MARKOV  CHAINS 

A  reversible  Maikov  chain  has  a  structurally-symmetric  transition  matrix  P ,  and  hence  has  an 
underlying  undirected  graph  G  which  is  both  connected  and  simple.  Furthermore,  Proposition  2.6  says 
that  P  has  the  second  largest  eigenvalue  A^cl.  The  main  result  of  this  paper  is  to  obtain  a  tighter 
upper  bound  for  of  P .  This  bound  will  be  expressed  in  terms  of  the  following  quantities: 

(1)  a  =  min{  pi;  :  i*j  ,  p,j> 0  },  the  smallest  non-zero  off-diagonal  entry  in  P , 

(2)  sn  =  the  skewness  of  the  equilibrium  distribution  it  of  P ,  and 

(3)  l±2(Q )  =  the  second  smallest  eigenvalue  of  the  Laplacian  matrix  Q  of  the  underlying  undirected 
graph  G (V,£)  of  P. 


Theorem  3.1  :  Let  Q  =  {1,2,  •  •  •  JV],  and  consider  a  reversible  Maikov  chain  on  the  state  space  Q 
with  transition  matrix  P,  and  equilibrium  distribution  n.  Also,  let  a,  sK,  and  p2(G)  be  as  defined 
above.  If  X  <  1  is  any  eigenvalue  of  P,  then 


X  <  1  - 


«  H2 (g) 

sn 


(3.1) 


Proof  :  Let  dt  =  Vrt7  for  each  ieQ,  and  define  the  NxN  diagonal  matrix 
D  -  diag  [  di.  d2,  ■  ■  ■  ,  dN  ).  Since  P  is  irreducible,  re,  >  0  for  each  re Q  from  part  (2)  of  Theorem 
2.2.  Therefore  d,> 0,  D  is  invertible,  and  D~l  =  diag  [  dfl ,  ••  •  ,  d[J]  ]. 

Let  X  <  1  be  any  eigenvalue  of  P  and  let  xeHN  be  the  corresponding  right  eigenvector,  i.e., 
Px  =  he.  Therefore,  xTD2(I-P)x  =  (1  -X)xTD2x,  which  can  be  written  as 


1-X  = 


xT(P2-W)x 

xtD2x 


(3.2) 


where  we  have  defined  the  matrix  W  =  D2P ,  i.e., 
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Wij  —  dj.  pij  —  iij  pij  (3.3) 

The  reversibility  condition  of  (2.2)  implies  that  W  is  symmetric.  Also,  the  irreducibility  of  P  implies 

that  WT  =  D2P 1  =  £)21,  by  Theorem  2.2  part  (1).  Therefore,  for  each  i  =  1,2, ...  ,N  we  have 

N 

ft.  =  Zwy  (3.4) 

Now,  consider  the  quadratic  form  in  the  numerator  of  (3.2)  which  can  be  written  as 

s 

xt(D2  -  W)x  =  X  (k—w^x2 
i=l 

Using  (3.4)  we  get 

xt(D2  -  W)x  -  X  X  wij  (x2  -  x,Xj)  f36N 

i-l  jm  i  v  ' 

j*i 

Now  consider  G(V,  E)  the  underlying  undirected  graph  of  P.  This  is  also  the  underlying  graph  of  W , 
since  for  j*i,  we  have  piy*0  if  and  only  if  >v,y*0.  Also,  w(y  =  w;i  since  W  is  symmetric.  Hence, 
(3.6)  can  be  written  as 


N  N 

Z  Z  wij  xi  xj 

i=i  j* \ 

i*i 


(3.5) 


where 


xt(D2  -  W)x  =  £  w,y  {Xi-Xjf  >  P  X  (*i~xj)Z- 


(3.7) 


P  =  min  (  Wij :  [i,j)eE  }.  (3.8) 

denotes  the  smallest  non-zero  off-diagonal  entry  in  W.  Define  =  max  jc,- ,  =  min  re,,  and 

ieO  jc  fi 

a  =  min  {  pij:  [i,j)eE  ).  (3.9) 

as  the  smallest  non-zero  off-diagonal  entry  in  P .  Since,  by  definition,  w,j  -  KiPij,  we  immediately  get 

PSarc™,  (3.10) 

Applying  Theorem  2.8  part  (2)  to  the  right  hand  side  of  (3.7)  and  using  (3.10)  we  get 

xT(D2 -W)x  ^anmilixTQx  (3.11) 

where  Q  is  the  Laplacian  matrix  associated  with  the  underlying  graph  G.  Combining  (3.2)  and  (3.11) 
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results  in 


i-x  *  **^4%- 

x1  D2x 


(3.12) 


It  must  be  noted  that  a:  is  a  right  eigenvector  of  P  with  eigenvalue  X<1,  while  it  =  1 T D2  is  a  left 
eigenvector  of  P  with  eigenvalue  1.  Theorem  2.2  part  (2)  immediately  shows  that  x  and  iz  must  be 
orthogonal,  i.e.,  1 TD2x  =  0.  So  consider  the  following  constrained  optimization  problem  : 

Minimize  zT Qz  over  all  ze  RN 
such  that  Ir D2z  =  0  and  zT  D2z  =  1. 


Setting  y  =  Dz  or  z  =  D  -1y ,  the  problem  becomes  equivalent  to 
Minimize  yT  D~xQD~xy  over  all  ye R" 
such  that  1T Dy  =  0  and  yTy  -  1. 

Recall  from  Section  2,  that  Q  is  a  symmetric  positive  semi-definite  matrix  with  eigenvalues 
0=\ix(Q )<M-2(<2 Also,  i  is  an  eigenvector  of  Q  for  eigenvalue  \sx(Q)  =  0.  Theorem 
2.11  part  (1)  shows  that  D~lQD~l  is  also  a  symmetric  positive  semi-definite  matrix,  by  treating 
I  =  D-1  Moreover,  D~lQD~x  D\  =  Q,  i.e.,  D 1  is  an  eigenvector  of  D~XQD~]  with  eigenvalue  0. 
Therefore,  the  above  optimization  problem  is  to  minimize  the  quadratic  form  yT D~lQD~ly  over  all 
normalized  vectors  ye RN  that  are  orthogonal  to  D 1,  the  eigenvector  corresponding  to  the  smallest 
eigenvalue  0  of  the  matrix  D^xQD~l.  From  quadratic  programming  theory  [14],  the  minimum  value  of 
the  quadratic  form  is  clearly  \i2(D~lQD~l),  the  second  smallest  eigenvalue  of  D~lQD~l.  Therefore, 

T 

x‘  D2x 

Applying  Theorem  2.11  part  (2)  to  the  right  hand  side  of  (3.13),  with  Z  =  D~x,  gives 


(3.13) 
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\i2{D-xQD-l)>—\s4Q) 
So  finally,  combining  (3.12),  (3.13),  and  (3.14),  we  get 


thus  proving  the  theorem.  □ 


1_X  s  a  •£==!■  P2(G) 


a  MgtQ) 

Sit 


(3.14) 


(3.32) 


For  some  graphs  G,  the  second  smallest  eigenvalue  (j.2((2  (G ))  is  easy  to  compute  analytically. 
Two  examples  are  given  below. 

Cycle  graphs:  If  G  is  a  simple-cycle  on  N  vertices,  then  the  eigenvalues  of  its  Laplacian  matrix  Q 
can  be  shown  to  be  [10] 

Hi  02)  =  2(1  -  cos(2jt(i-1)/AO)  (3.15) 

for  each  i  =  1,2,  •  •  •  JV.  Consequently,  p2(£2)  =  2(1  -  cos(2nJN))  which  approaches  0  as  N  — 

Hypercube  graphs:  If  G  is  an  n  -dimensional  hypercube  having  N  =  2"  vertices,  then  its  Laplacian 
matrix  Q  has  n+1  distinct  eigenvalues  [11]  given  by 

£,m=2 m  ;  m  =0,1,2 . n  (3.16) 

with  eigenvalue  2m  having  an  algebraic  multiplicity  j^j.  Consequently,  the  second  smallest  eigen¬ 
value  p2(Q )  =  2  which  is  independent  of  N ,  the  size  of  the  matrix. 


For  graphs  G  in  which  jx2(Q  (G ))  is  not  easy  to  compute,  one  can  use  a  lower  bound  derived  by 
Alon  [3]  given  below.  This  bound  requires  a  certain  expansion  parameter  of  the  undirected  graph  G 
which  we  now  define  as  follows. 

Definition  3.2  :  Let  G(V,E )  be  an  undirected  graph.  If  ScV  is  any  subset  of  vertices  in  G,  we 
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define  the  deleted  neighborhood  Nbd{S)  to  be  the  set  of  vertices  in  VS  which  are  joined  to  some 
vertex  in  S  by  an  edge  in  E . 

Definition  3.3  :  The  expansion  parameter  c(G)  of  an  undirected  graph  G(V,  E)  is  defined  as 

c(G)  =  min-^pJ-  (3.17) 

where  the  minimization  is  performed  over  all  subsets  ScV  such  that  0<|S  |. 


Theorem  3J  :  (3]  Let  G(V,  E)  be  graph  with  Laplacian  matrix  Q  and  expansion  parameter  c.  If 
(j.  >  0  is  any  eigenvalue  of  Q ,  then 


(3.18) 
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4.  APPLICATIONS  OF  THE  EIGENVALUE  BOUND 

As  an  application  of  the  results  of  Section  3,  we  consider  the  Simulated  Annealing  (SA)  algo¬ 
rithm.  This  algorithm  was  first  proposed  as  a  probabilistic  algorithm  for  solving  difficult  combina¬ 
torial  optimization  problems  [6].  It  has  been  used  with  some  success  in  problems  such  as  VLSI  lay¬ 
out  optimization,  the  design  of  FIR  filters  with  finite  precision,  and  image  restoration. 

We  describe  the  SA  algorithm  briefly.  Let  fl  =  { 1,  •  •  •  JV }  be  a  set  of  states  with  a  cost  func¬ 
tion  C:  Cl  ->  R.  The  SA  algorithm  attempts  to  find  a  state  with  globally  minimum  cost.  Let  x(k) 
denote  the  state  of  the  algorithm  at  time  k.  With  each  state  iefi,  we  associate  a  set  of  neighboring 
states  N,  cfl,  which  satisfy  the  following  assumptions: 

(4.1)  The  neighboring  sets  are  symmetric;  that  is,  j e  N,  if  and  only  if  i  e  /V, . 

(4.2)  Given  any  two  states  i  and  j  in  Q,  there  exists  a  finite  sequence  of  states  i0,i  lt  •  •  •  ,im  such 
that  ijpt ,  im=j,  and  t)+1eNJ(,  for  each  1=0,1,  •  ■  •  /n-1.  This  condition  is  often  referred  to  as 
the  reachability  requirement 

To  simply  matters  we  make  an  additional  assumptions  which  is  satisfied  in  most  applications. 

(4.3)  |A1,  |  =  p  for  each  ieQ,  i.e.,  all  neighbor  sets  are  of  the  same  size. 

Suppose  that  the  present  state  is  x(k)=i .  The  algorithm  then  randomly  picks  a  state  j  e  V,  with 
probability  1/|N,  |.  If  C(j)  <,  C (i),  it  sets  the  next  state  to  be  x(k+ 1)  =  j.  However,  if  C(j)>  C  (i), 
it  sets  the  next  state  to  be  x(fc+l)  =  j  with  probability  p  =  ^  *(£+1)  =  ,•  wjth  proba¬ 

bility  l-p.  In  other  words,  if  C(J)  >  C(i),  then  the  algorithm  accepts  j  as  the  next  state  with  proba¬ 
bility  p  or  remains  in  the  present  state  i  with  probability  l-p.  The  parameter  T> 0  plays  the  analo¬ 
gous  role  of  temperature  in  the  physical  annnealing  process.  We  define 


e  *  e~VI  (4.4) 

to  simplify  notation.  Note  that  T  =  0oge~1)-1.  So,  if  (XT  <+<»  then  0<e<l.  Also,  as  T  -»0  we  have 

£— *0. 


The  SA  algorithm  thus  simulates  a  time-homogeneous  Markov  chain  on  state  space  Q  with  transition 
matrix  P  -  {  ptj  ]  with  off-diagonal  entries  (i*j  )  given  by 

Pij  = 

where  [z  ]+  denotes  the  positive  part  of  a  real  number  z,  i.e.,  [z]+=z  if  z>0,  and  (z]+=0  if  z<0.  The 
diagonal  entries  of  P  are  given  by 


1  elcOMW  if  jeN. 
P 

0  if  jen-Ni 


(4.5) 


Pu  ~  1-2  Pij 

j*i 


(4.6) 


It  must  be  emphasized  that  we  have  assumed  a  fixed  temperature  T >0  for  all  time  k  of  the  SA 
algorithm.  This  is  often  referred  to  as  Fixed-Temperature-Simulated-Annealing  OFTSA)  as  opposed  to 
a  situation  wherein  the  temperature  is  allowed  to  vary  with  time  k  according  to  a  prespecified  cooling 
schedule  (see  [7,8,9]  for  details)  which  results  in  a  time-inhomogeneous  Markov  chain.  In  this  paper, 
however  we  focus  only  on  the  FTSA  algorithm. 

It  is  easy  to  check  that  that  the  assumptions  (4.1)  and  (4.2)  on  the  neighboring  sets  result  in  P 
being  primitive  and  structurally  symmetric  for  any  0<e<1.  Furthermore,  with  assumption  (4.3),  the 
equilibrium  distribution  vector  jt(e)  can  be  shown  (see  [9])  to  have  entries 

PC(i) 


«,(€)  = 


s 

£eC(/> 

i-v 


(4.7) 


which  is  often  called  the  Boltzmann  distribution  at  temperature  T  =  (loge'1)'1.  Using  (4.5),  and  (4.7), 


one  can  easily  verify  that  the  FTSA  Markov  chain  is  reversible. 
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Let  S*cft  denote  the  set  of  all  global  minima.  The  optimal  distribution  vector  it, * 

with  entries  defined  as  jc =  0  if  i  is  not  a  global  minimum,  and  7t  *  =  -■ 1  —  if  i  is  a  global 

j5*| 


is  a  vector 
minimum. 


Temperature  Asymptotics  :  For  the  FTSA  chain,  it  is  clear  from  (4.7)  that 

lim|  |jc(e)  -  Jt*||  =  0  (4.8) 

i.e.,  the  equilibrium  distribution  Jt(e)  approaches  the  optimal  distribution  as  e— >0.  For  a  chosen  0<e<1, 
let  vt(k)  denote  the  distribution  vector  of  the  FTSA  chain  at  time  k> 0  as  defined  in  Section  1.  From 
Theorem  2.2  part  (4)  we  have 

UmJ|ve(*)-rc(e)||  =  0  (4.9) 

Hence,  given  any  arbitrary  real  5>0,  from  (4.8)  and  (4.9)  there  is  an  e>0  and  a  time  k0,  such  that  the 
distribution  vector  of  FTSA  (at  the  chosen  e)  satisfies 

l|v£(*)-JC*||  <  8  (4.10) 

for  all  time  k>Jt0. 

In  this  section  we  are  primarily  interested  in  the  rate  of  convergence  of  (4.9)  as  a  function  of 
e— »0.  We  refer  to  this  as  the  temperature  asymptotics  of  FTSA.  From  the  discussion  in  Section  1, 
it  is  clear  that  for  a  particular  e>0.  the  rate  of  convergence  of  (4.9)  is  governed  by  the  time-constant  of 
convergence  x,  defined  by  (1.4),  of  the  FTSA  Markov  chain  with  transition  matrix  P.  Using  (1.4)  and 
(3.1)  we  now  derive  a  bound  for  x  and  study  the  behavior  of  this  bound  as  e— >0. 

From  Theorem  3.1  we  can  obtain  an  upper  bound  on  the  eigenvalue  of  the  transition  P  with 
second  largest  algebraic  value.  However,  to  obtain  a  meaningful  bound  on  x,  the  time-constant  of 
convergence,  we  need  an  upper  bound  on  the  eigenvalue  of  P  of  second  largest  magnitude.  To  this 
end,  we  consider  a  new  Markov  chain  corresponding  to  the  matrix 
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P  ='A(I+P).  (4.11) 

Clearly,  P  has  non-negative  eigenvalues;  thus,  the  algebraic  value  and  the  magnitude  of  an  eigenvalue 

of  P  are  the  same.  Also,  P  has  the  same  equilibrium  distribution  as  P .  Furthermore,  the  off-diagonal 

entries  of  P  are  half  the  corresponding  entries  of  P ;  hence,  P  is  also  reversible  by  (2.2)  and  has  the 

same  underlying  undirected  graph  as  P.  We  will  therefore  work  with  the  new  P  instead  of  P . 

Let  us  now  relate  the  parameters  used  in  the  bound  of  Theorem  3.1  to  the  parameters  of  the 
optimization  problem  being  solved  by  an  FTSA  Markov  chain.  Define 

A=  max  |  C(i)  —  CO)  I  (4.12) 

as  the  maximum  cost  difference  between  any  two  states.  Let  G(VJl)  be  the  underlying  undirected 
graph  of  P  (or  P)  with  Laplacian  matrix  Q  and  define 


5=  <  — ?Xr.  I  C(i)  -C(J)  I  (4.13) 

as  the  maximum  difference  in  costs  between  any  two  neighboring  states  in  the  Markov  chain.  Then, 
from  (4.7)  it  follows  that  the  skewness  of  the  chain  is  given  by 


sK  —  £■ 


(4.14) 


The  smallest  non-zero  off-diagonal  entry  of  P  can  be  computed  from  (4.5),  (4.11),  and  (4.13)  to  be 


e 

2p 


a  =  —  (4.15) 


where  p  is  the  number  of  neighboring  states  for  each  state  as  given  by  Assumption  (4.3).  Using  (3.1), 
(4.14),  and  (4.15)  we  get 

H2(G)e(A+i) 


0  £  XiiP)  £  1  - 


2p 


(4.16) 


from  which  one  can  bound  the  time-constant  for  convergence  for  sufficiently  small  e  using  (1.4) 

x  £  - ^2 - 

Ma(C) 


(4.17) 
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as  a  function  of  e  or  using  (4.14) 


as  a  function  of  the  skewness  sK. 
of  (3.42)  to  get 


x  £ 


2p  s^A) 


(4.18) 


In  case  is  not  easily  computable,  one  could  use  Alon’s  result 


X  £  4(l+2c"2)p  j«(1+&a)  (4.19) 

where  c  is  the  expansion  parameter  of  the  graph  G.  Since  8<A  by  definition,  one  can  get  a  less 

stringent  bound  as 


x  <  4  (1+2C2)  p  s«  (4.20) 

For  a  fixed  optimization  problem  (i.e.,  fixed  N,  p,  c,  etc.),  (4.20)  suggests  that  the  time-constant 
for  convergence  of  the  FTSA  Markov  chain  to  its  equilibrium  distribution  with  skewness  sn  is 
x  =  0(sK2).  In  practice,  usually  5<eA  which  yields  x  =  0(s*).  Furthermore,  the  bound  in  (3.1)  may 
not  be  tight  suggesting  an  even  slower  growth  of  x  as  a  function  of  the  skewness  sn. 


We  now  provide  an  example  of  a  cost  distribution  on  a  state  space  for  which  the  eigenvalue 
bound  of  (3.1)  for  the  FTSA  transition  matrix  is  the  best  possible  bound  when  treated  as  a  function  of 
skewness  alone.  We  will  also  compare  our  bound  with  that  of  Jerrum  and  Sinclair  [4]  for  this  exam¬ 
ple.  To  this  end  we  need  the  following  definitions. 


Definition  4.1  :  [4]  Given  a  reversible  Markov  chain  on  state  space  Q  with  transition  matrix  P  and 
equilibrium  distribution  n.  The  conductance  parameter  is  defined  as 

Z  Pijn> 

<!>(/»)  =  min‘g5i'^~s - 

Z^i 

i«S 

where  the  above  minimization  is  performed  over  all  subsets  S  of  states  with  Oc^7i,-£l/2. 

icS 


(4.21) 
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Theorem  4.2  :  [4]  For  a  reversible  transition  matrix  P  satisfying  pu  2  \H  for  all  ieQ,  we  have 

1  -  24 *P)  Z  HP)  *  1  -  (4.22) 

Example  43  :  Consider  a  simple  cycle  on  N=4n  vertices  as  the  underlying  graph  of  a  FTSA  Markov 
chain  with  a  cost  function  defined  as  follows  : 


C(t) 


i  if  l£i£/i 
2n+l-i  if  n+\<d<2n 
i-2n  if  2n+\<ii<3n 
4«+l-i  if  3n+\<i<4n 


(4.23) 


Using  these  costs,  p  =  2,  and  some  e>0,  define  the  transition  matrix  P  using  (4.5)  and  (4.6)  and  set 
P  =  lA(/  +P).  For  transition  matrix  P  it  can  be  shown  that 


A  =  n-1  ,5=1 

(4.24) 

a  =  4  ,  skewness  s  - 

4 

(4.25) 

HQ)  =  2(1  -  cos(4L)) 

2  n 

(4.26) 

II 

(4.27) 

Thus,  our  bound  from  (3.1)  gives 

1  _  ^(P)  2  (1  -  cos(JL)) 

2  In 

while  the  Jerrum  and  Sinclair  bound  from  (4.22)  gives 

(4.28) 

-  P2*-2 

(4.29) 

for  sufficiently  small  e  and  large  n . 
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For  a  fixed  n>2  (i.e.,  a  fixed  problem),  it  is  clear  that  our  bound  in  (4.28)  is  superior  to  the  Jerrum  and 
Sinclair  bound  in  (4.29)  for 

0  <  e  <  {8(1  -  cos(Ji/2n))}1/("-2> 

and  the  bound  in  (4.28)  gets  even  better  as  e-*0.  Using  the  lower  bound  in  (4.22)  and  our  bound  in 
(4.28)  and  we  get  bounds  for  the  time-constant  for  convergence  to  equilibrium  as  a  function  of  skew¬ 
ness  s  rather  than  e  as 


2  5  <  x  <  - - - s(i+i/(»-1» 

1  -  cos(jc/2rt ) 


(4.30) 


For  example,  if  we  consider  n=ll,  (4.30)  reduces  to 

2  s  <  x  <  196.5  s11  (4.31) 

indicating  that  our  upper  bound  for  the  time-constant  x  is  a  fairly  tight  bound  for  large  skewness  s  (or 

small  e). 

The  purpose  of  Example  4.3  was  merely  to  illustrate  an  example  of  a  reversible  Markov  chain  for 
which  the  eigenvalue  bound  (hence,  a  bound  on  the  rate  of  convergence)  is  fairly  tight.  The 
corresponding  optimization  problem,  however,  is  very  easy,  since,  by  construction,  the  states  1,  2 n, 
2n+l,  and  4n  have  the  globally  minimum  cost  of  1.  The  following  example  illustrates  a  difficult  and 
more  realistic  optimization  problem  for  which  one  can  still  use  our  eigenvalue  bound  of  (3.1)  to  obtain 
a  meaningful  bound  the  time-constant  of  convergence  of  the  corresponding  FTSA  Markov  chain. 
Estimating  the  conductance  parameter  for  this  chain,  however,  is  not  straight  forward;  hence,  the  Jer¬ 
rum  and  Sinclair  eigenvalue  bound  of  (4.22)  is  not  directly  useful  in  obtaining  a  meaningful  bound  for 
the  time-constant  in  this  case.  However,  with  considerable  ingenuity,  Jerrum  and  Sinclair  have  been 
successful  in  obtaining  good  lower  bounds  for  the  conductance  of  certain  classes  of  reversible  chains 
[4],  Indeed,  for  these  chains,  the  conductance  is  much  larger  than  O  (s*-1);  hence,  our  upper-bound  by 
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(3.1)  is  not  tight  in  this  case.  Our  bound,  on  the  other  hand  is  very  simple  to  compute  in  general,  as 
the  following  example  will  demonstrate,  and  is  also  tight  on  certain  chains  (as  considered  in  Example 
4.3). 

Example  4.4  :  Let  {a  •  •  •  <an }  be  a  set  of  given  positive  integers  n  ascending  order  and  define 

O  i+fl  •  •  '  +fl„ 

K  =  — — t - —  (4.32) 

2 

Let  Q  denote  the  state  space  of  all  binary  vectors  of  length  n  and  consider  a  state  u={u  hu2 . u„) 

where  «,  e  {0,1 }.  Define  the  cost  of  the  state  as 

C(u)=  \K  -  |  (4.33) 

i-i 

Define  the  neighbors  of  a  state  u  as  all  states  differing  from  u  in  exactly  one  bit.  Consider  an  FTSA 
algorithm  to  find  the  state  of  minimum  cost.  This  is  the  optimization  version  of  the  SET_PARTITION 
problem  that  is  known  to  be  NP-CompIete  [12].  Clearly,  N  =2",  p  =  n,  8  =  an,  A  -  K,  skewness 
j  =  e* ,  and  the  underlying  graph  is  the  n  -dimensional  hypercube.  Therefore,  \i2(Q  )  =  2.  Using 
(4.16)  we  immediately  get 

ff+a. 

^(/S)  £  1-1 -  (4.34) 

n 

and  from  (4.18)  we  have  a  bound  for  the  time-constant  in  terms  of  the  skewness  s  as 

X  <  n  s'~-'K  (4.35) 

For  example,  if  the  given  integers  are  (3,5,6,11,15),  we  have  n  =  5,  an  =  15,  and  K  =  20.  From 

(4.35),  the  time-constant  for  an  FTSA  algorithm  to  solve  the  given  instance  to  reach  an  equilibrium 
distribution  of  skewness  s  =  104  is  bounded  above  by  x  £  5xl07  iterations. 
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5.  CONCLUSIONS 

In  this  paper  we  have  derived  a  new  upper  bound  on  the  second  largest  eigenvalue  of  a  reversible 
Markov  chain.  The  bound  is  a  simple  function  of  the  skewness  of  the  equilibrium  distribution  of  the 
chain  and  we  give  examples  of  reversible  chains  where  the  upper  bound  is  fairly  tight.  The  upper 
bound  on  the  eigenvalue  enables  us  to  study  the  time  constant  of  convergence  of  the  Markov  chain  to 
its  equilibrium  distribution.  In  particular,  we  can  bound  the  time  constant  of  convergence  of  a  fixed 
temperature  simulated  annealing  (FTSA)  algorithm  solving  a  particular  instance  of  an  optimization 
problem.  Moreover,  we  can  study  the  growth  of  this  bound  as  the  temperature  approaches  zero  or 
skewness  becomes  arbitrarily  large. 
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